1. Introduction 

In the past few years, there has been considerable research on concurrency control, including 
both systems design and theoretical study. The problem is roughly as follows. Data in a large 
(centralized or distributed) database is assumed to be accessible to users via transactions, each of 
which is a sequential program which can carry out many steps accessing individual data objects. It is 
important that the transactions appear to execute "atomically", i.e. without intervening steps of other 
transactions. However, it is also desirable to permit as much concurrent operation of different 
transactions as possible, for efficiency. Thus, it is not generally feasible to insist that transactions run 
completely serially. A notion of equivalence for executions is defined, where two executions are 
equivalent provided they "look the same" to all transactions and to all data objects. The serializable 
executions are just those which are equivalent to serial executions. One goal of concurrency control 
design is to ensure that all executions of transactions be serializable. 

Several characterization theorems have been proved for serializability; generally, they amount to 
the absence of cycles in some relation describing the dependencies among the steps of the 
transactions. A very large number of concurrency control algorithms have been devised. Typical 
algorithms are those based on two-phase locking [EGLT], and those based on timestamps [La]. 
Although many of these algorithms are very different from each other, they can all be shown to be 
correct concurrency control algorithms. The correctness proofs depend on the absence-of-cycles 
characterizations for serializability. 

More recently, it has been suggested [Re, M, LiS] that some additional structure on transactions 
might be useful for programming distributed databases, and even for programming more general 
distributed systems. The suggested structure permits transactions to be nested. Thus, a transaction 
is not necessarily a sequential program, but rather can consist of (sequential or concurrent) 
subtransactions. The intention is that the subtransactions are to be serialized with respect to each 
other, but the order of serialization need not be completely specified by the writer of the transaction. 
This flexibility allows more concurrency in the implementation than would be possible with a 
single-level transaction structure consisting of sequential transactions. The general structure allows 
transactions to be nested to any depth, with only the leaves of the nesting tree actually performing 
accesses to data. 

Transactions are often used not only as a unit of concurrency, but also as a unit of recovery. In a 
nested transaction structure, it is natural to try to localize the effects of failures within the closest 
possible level of nesting in the transaction nesting tree. One is naturally led to a style of programming 
which permits a transaction to create children, and to tolerate the reported failure of some of its 
children, using the information about the occurrence of the failures to decide on its further activity. 
The intention is that failed transactions are to have no effect on the data or on other transactions. 
This style of programming is a generalization of the "recovery block" style of [Ra] to the domain of 
concurrent programming. Indeed, this style seems to be especially suitable for programming 
distributed systems, since many types of failures of pieces of programs are likely to occur in such 
systems. 

Reed [Re] has designed an algorithm which uses multiple versions of data to implement nested 
transactions. Moss [M] has abstracted away from Reed's specific implementation of nested 
transactions, presenting a general description of the nested transaction model. He has also 
developed an alternative implementation of the nested transaction model, based on two-phase 
locking. This model and implementation are fundamental to the Argus distributed computing 
language, now under development by Liskov's group at MIT [LiS]. 

The basic correctness criteria for nested transactions seem to be clear enough, intuitively, to 
allow implementors a sufficient understanding of the requirements for their implementation. 
However, some subtle issues of correctness have arisen in connection with the behavior of failed 
subtransactions. For example, the Argus group has decided that a pleasant property for an 
implementation to have is that all transactions, including even "orphans" (subtransactions of failed 
transactions), should see "consistent" views of the data (i.e. views that could occur during an 
execution in which they are not orphans). The implementation goes to considerable lengths to try to 
ensure this property, but it is difficult for the implementors to be sure that they have succeeded. 

It seems clear that some basic groundwork is needed before such properties can be proved. 
Namely, the theory already developed for concurrency control of single-level transaction systems 
without failures needs to be generalized to incorporate considerations of nesting and failures. The 
model needs to be formal, in order to allow careful specification of all the correctness requirements - 
the simple and intuitive ones, as well as the rather subtle ones. 

This paper begins to develop this groundwork. First, a simple "action tree" structure is defined, 
which describes the ancestor relationships among executing transactions and also describes the 
views which different transactions have of the data. A generalization of serializability to the domain of 
nested transactions with failures is defined. A characterization is given for this generalization of 
serializability, in terms of absence of cycles in an appropriate dependency relation on transactions. A 
slightly simplified version of Moss' algorithm is presented in detail, and a correctness proof is given. 



The correctness proof is complete, detailed, and rigorous. Its style appears to be quite interesting 
in its own right. Producing such a proof was a very difficult task; the main issues that made it so 
difficult were the nesting of transactions and the possible failures of subtransactions. The initial 
attempts to develop such a proof led to extremely complicated, non-modular constructions. 
Gradually, after we had tried for many months to organize the proof, the uniform general proof 
structure presented in this paper began to emerge. This structure allows the proof to be decomposed 
in a very natural way. Without this structure, it is doubtful that we would have been able to complete a 
proof at all. (We know of few comparably successful complete proofs for difficult distributed 
algorithms.) 

The proof is based on certain algebras, which we call "event-state" algebras. An event-state 
algebra is an abstract description of a computing system and the protocol that governs its behavior. 
The elements of the algebra are states of the computing system. An operation of the algebra is an 
"event" of the system, i.e. a computation step; it transforms a state to another state. The operations 
are only partially defined, in correspondence with the fact that a step might not be applicable to all 
states. The rules that specify when an operation is defined correspond to the algorithm or protocol 
that controls the execution of the system. 

Another important concept for our proof is the notion of a mapping between algebras. It is useful 
to describe a computing system on several different levels of abstraction, i.e. as several distinct 
algebras. A mapping from an algebra 𝒜 to another algebra ℬ is a "simulation" of ℬ by 𝒜 provided 
that every valid computation of 𝒜 is mapped to a valid computation of ℬ. Thus, 𝒜 is, in a sense, an 
"implementation" of ℬ. 

The approach taken in this paper to a correctness proof of Moss' algorithm is the following. The 
system governed by the algorithm is described by a succession of algebras, each one describing 
more specific details about the algorithm and its implementation. In the highest level algebra, the only 
precondition for the applicability of a step (an operation) is that it preserve global correctness. This 
algebra is quite far from the algorithm itself. As a matter of fact, this algebra represents "what needs 
to be achieved" by the system. Successive algebras get closer to the algorithm, i.e. to "how it is 
achieved". Showing the existence of a simulation mapping between each pair of successive levels is 
the heart of the correctness proof. 

One novel aspect of the simulations we use, different from the usual notions of "abstraction" 
mappings, is that our simulations map single lower level states to sets of higher level states, rather 
than just single higher level states. (We call them "possibilities" mappings.) This extra flexibility 
seems quite convenient for many implementations, allowing the lower level algebra sometimes to 
contain less detail than the higher level algebra. For example, it might be easy to prove correctness 
of an algorithm which maintains lots of auxiliary data. The correctness of an algorithm which 
contains less detail could be proved, in our model, by showing that it simulates the algorithm which 
maintains the auxiliary data. 

While possibilities mappings are convenient for proving correctness of ordinary centralized 
algorithms, they produce their greatest payoff for distributed algorithms. Namely, a distributed 
algorithm is described as a special case of an event-state algebra, a "distributed algebra". A 
distributed algebra has a set of "components". The state set for the algebra is just a Cartesian 
product of local states, one for each component. The events are partitioned among the set of 
components, according to which component is assumed to "perform" the event. Event domains and 
transitions are defined componentwise. To show that a distributed algebra simulates some other 
"abstract" algebra, it suffices to define an appropriate possibilities mapping from the global states of 
the distributed algebra to sets of states of the abstract algebra. It turns out to be extremely natural to 
describe such a mapping by first describing a possibilities mapping from the local state of each 
component to sets of abstract states. The image of a local state under this mapping just represents 
the set of possible global states consistent with the knowledge of the particular component. The 
possibilities for the entire distributed algebra are simply obtained by taking the intersection of the 
possibilities consistent with the knowledge of all the components. 

It appears that this technique extends to give natural proofs of many algorithms, especially 
distributed algorithms, and thus warrants further investigation. Goree [G] presents a slightly more 
general development of the technique than is presented in this paper, but more remains to be done. 

The concurrency control definitions given in this paper express the most fundamental correctness 
requirements, but not subtle conditions such as correctness of orphans' views. Issues of fairness and 
eventual progress are not addressed, but rather only safety properties, serializability in particular. 
Future work involves extending the framework presented here to allow expression of these other 
properties, and to allow correctness proofs for the difficult algorithms which guarantee these 
properties. Some further work in these directions has already been carried out: Goree [G] gives a 
definition for correctness of orphans' views, and has given a correctness proof for a complicated 
algorithm used in the implementation of Argus to maintain correctness of orphans' views in the face 
of transaction aborts. 

A related recent paper [BBGLS] also addresses the problem of proving correctness of algorithms 
implementing nested transactions. However, that paper does not address issues of failure and 
recovery, which are primary considerations of the present paper. Also, the kind of nesting they 
consider appears to be somewhat different from ours: it appears to be designed primarily for 
describing levels of data abstraction. Finally, the proof techniques of [BBGLS] are quite different 
from ours. 

Although our variant of Moss' algorithm is described completely in this paper, we urge the 
interested reader to read Moss' presentation in [M]. His presentation gives useful background and 
context for the algorithm, as well as a much more intuitive description of the algorithm than is 
presented here. 

2. Event-State Algebras 

In this section, we describe the event-state algebra framework. This framework is used in the later 
sections to organize the formal correctness proof for Moss' algorithm. The algorithm is described in a 
series of five levels, each of which is described as an event-state algebra. 

The reader who is mainly interested in the formal model for nested transactions, and in Moss' 
algorithm, rather than in proofs of concurrent algorithms, can safely skim the contents of this section. 

2.1. Algebras and Simulations 

We begin with the basic algebra definitions. An event-state algebra, 𝒜 = <A, σ, Π>, consists of a 
set A of states, an element σ ∈ A, the initial state, and a set Π of partial unary operations (the events). 
In this paper, we will usually refer to an event-state algebra as simply an algebra. 

Next, we give standard definitions for computability concepts. For any event π, we let domain(π) 
denote the set of states for which π is defined. Let a be a state, and let Φ = (π_1, ..., π_k) be any finite 
sequence of events chosen from Π. Then Φ is said to be valid from a provided b = 
π_k(π_{k-1}(...(π_1(a))...)) is defined (i.e. provided that π_{i-1}(...(π_1(a))...) is in domain(π_i), for all i, 
1 ≤ i ≤ k). In this case, b is called the result of Φ applied to a. An infinite sequence of events is said to be 
valid from a provided all its finite prefixes are valid from a. We say that Φ is valid provided it is valid 
from σ, and the result of Φ is defined to be the result of Φ applied to σ. We write a ⊢ b provided there 
is some finite Φ, valid from a, for which b is the result of Φ applied to a. A state b is computable 
provided σ ⊢ b. 
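
As an informal illustration of these definitions (not part of the formal development), the following Python sketch models events as partial functions that return None outside their domain, and computes the result of a finite event sequence. The counter algebra, the names `apply_sequence`, `SIGMA` and `EVENTS`, and the use of None for "undefined" are all assumptions made for the example.

```python
from typing import Callable, Dict, Hashable, Iterable, Optional

State = Hashable
# An event is a partial unary operation; None signals "undefined on this state".
# (Assumption of this sketch: None is never itself a state.)
Event = Callable[[State], Optional[State]]


def apply_sequence(events: Dict[str, Event], a: State, phi: Iterable[str]) -> Optional[State]:
    """Return the result of phi applied to a, or None if phi is not valid from a."""
    current = a
    for name in phi:
        nxt = events[name](current)
        if nxt is None:  # current lies outside the domain of this event
            return None
        current = nxt
    return current


# Hypothetical toy algebra: a counter that can be incremented while below 2
# and reset only when it equals 2.
SIGMA = 0
EVENTS: Dict[str, Event] = {
    "inc": lambda s: s + 1 if s < 2 else None,
    "reset": lambda s: 0 if s == 2 else None,
}
```

In this toy algebra the sequence (inc, inc, reset) is valid and its result is the initial state, while (inc, reset) is not valid, because reset is undefined on state 1.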

In order to decompose our proof into levels of abstraction, we require a definition of "simulation" 
of an algebra 𝒜 = <A, σ, Π> by another algebra 𝒜' = <A', σ', Π'>. In this paper, we present a very 
weak definition. An interpretation of 𝒜 by 𝒜' is a mapping h: Π' → Π ∪ {Λ}. (Here, Λ represents a 
null event.) We extend h to a homomorphism mapping event sequences of 𝒜' to event sequences of 
𝒜 in the obvious way (deleting occurrences of Λ). An interpretation, h, is a simulation of 𝒜 by 𝒜' 
provided that h(Φ') is a valid event sequence for 𝒜 whenever Φ' is a valid event sequence for 𝒜'. 

We note that these definitions do not rule out certain trivial situations. We have not imposed the 
general requirement that 𝒜' include a representation of every event in 𝒜. We have also not imposed 
any requirements that events of 𝒜' be defined on large domains. Thus, our techniques are not 
powerful enough to prove that 𝒜' does everything which is required to implement 𝒜 correctly; rather, 
we assume that 𝒜' is given, and we are to prove that everything it does is correct for 𝒜. We believe 
that the more powerful techniques required to ensure the stronger properties require extra machinery, 
and a more sophisticated general theory than we wish to present here. 

The first lemma gives a basic composition result. This lemma justifies our composition of 
simulation results for adjacent levels, to prove a simulation result for non-adjacent levels. 

Lemma 1: Assume that 𝒜, 𝒜' and 𝒜'' are algebras, that h is a simulation of 𝒜 by 𝒜' 
and h' is a simulation of 𝒜' by 𝒜''. Then h ∘ h' is a simulation of 𝒜 by 𝒜''. 
Proof: Straightforward. 



2.2. Possibilities Mappings 

Our basic method for proving correctness is showing that simulations exist between adjacent 
members of a sequence of algebras. Therefore, we need a tool that can be used to show that a 
mapping is a simulation. In this subsection, we give a sufficient condition for a mapping h from 𝒜' to 
𝒜 to be a simulation. The condition involves defining a correspondence between states of the two 
algebras, in addition to events. It turns out to be most convenient, for the reasons discussed in the 
Introduction, to allow the state mapping to map a single state of 𝒜' to a set of states of 𝒜 rather than 
just to a single state. The states in such a set are called "possibilities" - i.e., the "possible" states 
corresponding to a given state. If we think of 𝒜' as a "concrete" algebra, and 𝒜 as a more "abstract" 
algebra, then we see that a possibilities mapping allows single "concrete" states to be mapped to sets 
of "abstract" states rather than just single abstract states. 

Let h: A' ∪ Π' → 𝒫(A) ∪ Π ∪ {Λ} be such that h(a') ∈ 𝒫(A) for all a' ∈ A', and h restricted to Π' is 
an interpretation, i.e. h(π') ∈ Π ∪ {Λ} for all π' ∈ Π'. (Here, 𝒫 denotes the power set.) Then h is a 
possibilities mapping from 𝒜' to 𝒜 provided the following are true: 



(a) σ ∈ h(σ'). 

Assume a and a' are computable in 𝒜 and 𝒜', respectively, and a ∈ h(a'). Assume π' ∈ Π'. 
Assume a' ∈ domain(π') and b' = π'(a'). 

(b) If h(π') = π ∈ Π, then a ∈ domain(π). 

(c) If h(π') = π ∈ Π, then π(a) ∈ h(b'). 

(d) If h(π') = Λ, then a ∈ h(b'). 

Property (a) says that the initial state of 𝒜 is among the possibilities for the initial state of 𝒜'. 
Property (b) says that an event is only performed in 𝒜' when its image event can be performed in 𝒜. 
Properties (c) and (d) say that events performed in 𝒜' preserve possibilities. The following diagram 
should be helpful in understanding (b) and (c). A similar diagram can be drawn to illustrate (d). 

Figure 1: A Property of Possibilities Maps 

The following lemmas show that any possibilities mapping is a simulation. 

Lemma 2: Let h be a possibilities mapping from 𝒜' to 𝒜. If Φ' is a valid event 
sequence for 𝒜', and h(Φ') = Φ, then Φ is a valid event sequence for 𝒜. In addition, if Φ' is 
finite, a' is the result of Φ' and a is the result of Φ, then a ∈ h(a'). 

Proof: By induction on the length of Φ'. 

Lemma 3: Any possibilities mapping from 𝒜' to 𝒜 is a simulation of 𝒜 by 𝒜'. 
Proof: Immediate by Lemma 2. 



□ 
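
For finite algebras, properties (a)-(d) can be checked by brute force. The sketch below is one way to do so, under stated assumptions: events are partial functions returning None where undefined, the null event is represented by None in the event map, and the two toy algebras are hypothetical. The abstract algebra keeps auxiliary data (the previous counter value) that the concrete algebra omits, so each concrete state is mapped to a set of abstract possibilities.

```python
from typing import Callable, Dict, Hashable, Optional, Set, Tuple

State = Hashable
Event = Callable[[State], Optional[State]]  # None means "undefined here"
Algebra = Tuple[State, Dict[str, Event]]    # (initial state, named events)


def computable(algebra: Algebra) -> Set[State]:
    """All states reachable from the initial state by valid event sequences."""
    sigma, events = algebra
    seen, frontier = {sigma}, [sigma]
    while frontier:
        s = frontier.pop()
        for ev in events.values():
            t = ev(s)
            if t is not None and t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen


def is_possibilities_mapping(abstract, concrete, h_state, h_event) -> bool:
    """Check properties (a)-(d) for state map h_state and event map h_event."""
    sigma, events = abstract
    sigma_c, events_c = concrete
    if sigma not in h_state(sigma_c):                  # property (a)
        return False
    abstract_states = computable(abstract)
    for a_c in computable(concrete):
        for name_c, ev_c in events_c.items():
            b_c = ev_c(a_c)
            if b_c is None:                            # a' is not in domain(pi')
                continue
            image = h_event[name_c]                    # None stands for the null event
            for a in h_state(a_c) & abstract_states:
                b = a if image is None else events[image](a)
                if b is None:                          # property (b) fails
                    return False
                if b not in h_state(b_c):              # property (c) or (d) fails
                    return False
    return True


# Hypothetical abstract algebra: state (n, previous n); hypothetical concrete algebra: n only.
def abs_inc(s):
    n, _last = s
    return (n + 1, n) if n < 2 else None


ABSTRACT: Algebra = ((0, None), {"inc": abs_inc})
CONCRETE: Algebra = (0, {"inc": lambda n: n + 1 if n < 2 else None})
H_EVENT = {"inc": "inc"}


def h_state(n):
    """Every abstract state whose counter agrees with the concrete counter."""
    return {(n, last) for last in (None, 0, 1)}
```

Here `is_possibilities_mapping(ABSTRACT, CONCRETE, h_state, H_EVENT)` holds, whereas a state map that sends every concrete state to the single possibility (0, None) violates property (c).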

2.3. Distributed Algebras 

Next, we define a special kind of event-state algebra, called a "distributed algebra". A distributed 
algebra is one which can be decomposed into components in a simple way: the states are Cartesian 
products of states for the components, each event is assumed to be originated by some particular 
component (although it can affect other components), and the definability and effects of events are 
locally determined. Such an algebra provides a natural structure for describing distributed 
algorithms. Processors in a network and message systems are typical examples of components in 
such a decomposition. 

An algebra, 𝒜 = <A, σ, Π>, is said to be distributed over a finite index set I using d, provided that A 
is the Cartesian product of sets A_i, i ∈ I, d is a mapping, d: Π → I, giving the "doer" of each event, and 
the following two conditions are satisfied. 

- (Local Domain) Let i = d(π). If a, b ∈ A and a_i = b_i, then a ∈ domain(π) if and only if b ∈ 
domain(π). 

- (Local Changes) If a, b ∈ domain(π), a' = π(a), b' = π(b) and a_i = b_i, then a'_i = b'_i. 

The local domain property says that the state of the doer of an event determines the definability of 
that event. The local change property says that the changes caused by an event are defined 
componentwise. Note that in the local change property, the component i need not necessarily be the 
doer of π; we permit other components to be affected by π, but assume that the effect is uniquely 
determined by π and the state of the component. Strictly speaking, we could have omitted mention of 
both of these properties in this paper, since they are not needed to prove the one simple result we 
obtain (Lemma 4) about distributed algebras. However, the properties seem to describe the locality 
structure of distributed algorithms quite accurately, and so we present them in anticipation of further 
study. 
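
The two locality conditions can be tested exhaustively when the component state sets are finite. The following sketch does this for a hypothetical two-component system (a processor, index 0, and a message channel, index 1); the function name `check_locality` and the example events are assumptions introduced for illustration only.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Tuple

GlobalState = Tuple[int, ...]
Event = Callable[[GlobalState], Optional[GlobalState]]  # None means "undefined"


def check_locality(component_states: List[List[int]],
                   events: Dict[str, Event],
                   doer: Dict[str, int]) -> bool:
    """Check the Local Domain and Local Changes conditions by enumeration."""
    states = list(product(*component_states))  # Cartesian product of local state sets
    for name, ev in events.items():
        d = doer[name]
        for a, b in product(states, repeat=2):
            a2, b2 = ev(a), ev(b)
            # Local Domain: definability depends only on the doer's local state.
            if a[d] == b[d] and (a2 is None) != (b2 is None):
                return False
            # Local Changes: equal i-th components before imply equal i-th components after.
            if a2 is not None and b2 is not None:
                for i in range(len(component_states)):
                    if a[i] == b[i] and a2[i] != b2[i]:
                        return False
    return True


# Hypothetical system: processor state is 1 when it holds a message to send;
# channel state counts queued messages, capped at 2.
COMPONENTS = [[0, 1], [0, 1, 2]]


def send(s):
    proc, chan = s
    return (0, min(chan + 1, 2)) if proc == 1 else None


def bad_send(s):
    proc, chan = s
    return (0, chan) if chan > 0 else None  # definability depends on the channel, not the doer
```

With the processor as doer, `send` satisfies both conditions even though it changes the channel, while `bad_send` violates Local Domain.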

It happens that there is a particularly natural way to define a possibilities mapping from a 
distributed algebra to another algebra. Namely, we define a "local mapping", from the local state of 
each component of the distributed algebra to a set of abstract states. The result of this mapping 
should be thought of as the set of possible abstract states, as far as a particular component can tell 
from its local knowledge. The mapping from a global state of the distributed algebra can then be 
defined to yield the intersection of the images of all the component states. The conditions we require 
for local mappings are chosen to be sufficient to guarantee that the derived global mapping is a 
possibilities mapping. 



Let 𝒜' = <A', σ', Π'> be an algebra, distributed over I using d. Let 𝒜 = <A, σ, Π> be any algebra. 
Let h be an interpretation from 𝒜' to 𝒜. For each i ∈ I, let h_i: A' → 𝒫(A) be such that h_i depends on A'_i 
only - i.e. if a_i = b_i then h_i(a) = h_i(b). Then we say that h and h_i, i ∈ I, form a local mapping from 𝒜' to 
𝒜 provided the following conditions are satisfied. 

(a) For all i ∈ I, σ ∈ h_i(σ'). 

Fix any i ∈ I (for properties (b)-(d)). Assume a and a' are computable in 𝒜 and 𝒜', respectively, 
and a ∈ h_i(a'). Assume π' ∈ Π', d(π') = i. Assume a' ∈ domain(π'), and b' = π'(a'). 

(b) If h(π') = π ∈ Π, then a ∈ domain(π). 

Fix (for properties (c) and (d)) any j ∈ I. (This j can be the same as or different from i.) 

(c) Assume h(π') = π ∈ Π and a ∈ h_j(a'). Then π(a) ∈ h_j(b'). 

(d) Assume h(π') = Λ and a ∈ h_j(a'). Then a ∈ h_j(b'). 

That is, (a) says that the initial state of 𝒜 is in the set of possibilities for each component's initial 
state. Property (b) says that an event is only performed in 𝒜' when its doer knows that its image event 
can be performed in 𝒜. Properties (c) and (d) consider the situation from the point of view of an 
arbitrary component j. Property (c) says that an event with doer i preserves possibilities at component 
j. Property (d) is analogous to (c), for events whose images are null events. 

The following figure illustrates property (b). 

Figure 2: A Property of Local Mappings 

The following figure illustrates property (c). 

Figure 3: Another Property of Local Mappings 

The following lemma shows that local mappings yield possibilities mappings. 

Lemma 4: Let 𝒜 and 𝒜' = <A', σ', Π'> be algebras, where 𝒜' is distributed over 
I. Assume that h and h_i, i ∈ I, form a local mapping from 𝒜' to 𝒜. Extend h to A' ∪ Π' by 
defining h(a') = ∩_{i ∈ I} h_i(a'). Then h is a possibilities mapping from 𝒜' to 𝒜 (and therefore 
a simulation of 𝒜 by 𝒜'). 

Proof: We check the four properties of the possibilities mapping definition. 

(a) To see that σ ∈ h(σ'), it suffices to show that σ ∈ h_i(σ') for all i ∈ I. But this is exactly 
the statement of property (a) of the local mapping definition. 

Now we assume the hypotheses supplied for parts (b)-(d) of the possibilities mapping 
definition. Assume also that d(π') = i. 

(b) Since a ∈ h(a'), it is obvious that a ∈ h_i(a'). Property (b) of the local mapping 
definition implies that a ∈ domain(π). 

(c) In order to show that π(a) ∈ h(b'), it suffices to fix an arbitrary j ∈ I and show that 
π(a) ∈ h_j(b'). Since a ∈ h_j(a'), the needed property follows from (c) of the local mapping 
definition. 

(d) It suffices to show that a ∈ h_j(b') for any j ∈ I. This follows as in the preceding 
argument from (d) of the local mapping definition. 
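
The construction used in Lemma 4, in which the global possibilities are the intersection of the local ones, can be sketched as follows. The example is hypothetical: the abstract state is a pair of bits, and each of two components knows only its own bit, so its local mapping returns every abstract state consistent with that bit. The names `global_possibilities`, `h_0` and `h_1` are assumptions of the sketch.

```python
from typing import Callable, Dict, Set, Tuple

AbstractState = Tuple[int, int]
ConcreteState = Tuple[int, int]  # one local state per component
LocalMap = Callable[[ConcreteState], Set[AbstractState]]


def global_possibilities(local_maps: Dict[int, LocalMap], a_prime: ConcreteState) -> Set[AbstractState]:
    """h(a') is the intersection over all components i of h_i(a')."""
    sets = [h_i(a_prime) for h_i in local_maps.values()]
    return set.intersection(*sets)


def h_0(a_prime: ConcreteState) -> Set[AbstractState]:
    # Component 0 knows the first bit only (h_0 depends only on a_prime[0]).
    return {(a_prime[0], y) for y in (0, 1)}


def h_1(a_prime: ConcreteState) -> Set[AbstractState]:
    # Component 1 knows the second bit only (h_1 depends only on a_prime[1]).
    return {(x, a_prime[1]) for x in (0, 1)}


LOCAL_MAPS: Dict[int, LocalMap] = {0: h_0, 1: h_1}
```

Each component alone leaves two abstract states possible; intersecting the two sets pins down a single abstract state, e.g. `global_possibilities(LOCAL_MAPS, (1, 0))` is `{(1, 0)}`.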



If the definitions in this section are to be used in correctness proofs for the widest possible class 
of algorithms, they will probably need to be generalized. In particular, it seems appropriate to permit 
single events of a more concrete algebra to interpret sequences of events of a more abstract algebra. 
(See Goree [G] for definitions and uses for this generalization.) Also, allowing each algebra to have a 
set of initial states rather than just a single initial state would probably be useful. Since we do not 
need these generalizations here, we do not make these extensions. 

3. Action Trees 

In this section, we provide the basic definitions needed to describe properties of nested 
transactions. The definitions in this section describe a particular data structure, called an "action 
tree", which provides a natural representation of nested transactions, the relationships between 
them, and their views of data. We define "serializability" in terms of action trees. We also prove 
several very basic lemmas about the definitions. 

We caution the reader that there are many definitions in this section, and he should not try to 
remember them all. Rather, we suggest that he read the definitions once for familiarity, and then use 
the section for later reference. 

In the rest of the paper, we often refer to transactions as just "actions", for brevity. This departure 
from the usual conventions of database theory has been made for consistency with the Argus work. 

3.1. Objects and Actions 

The system is assumed to contain a set of data objects, upon which the nested actions operate. 
We begin with some definitions for objects. Let obj be a universal set of data objects. For each x ∈ 
obj, let values(x) denote the set of values x can assume, including a distinguished initial value init(x). 
A value assignment is a total mapping, f, from obj to values(obj), having the property that f(x) ∈ 
values(x) for all x ∈ obj. 

Next, we give basic definitions for actions. In this paper, we have chosen to avoid modelling 
transactions explicitly, with a particular programming model. Rather, we have attempted to extract 
from such a model, just that information which is needed for concurrency control theory. 

Let act be a universal set of actions. Let U be a distinguished action. We assume that the actions 
are configured a priori into a tree, representing their nesting relationship, with U as the root. For 
every A ∈ act - {U}, let parent(A) denote a unique parent action for A. Let siblings denote {(A,B) ∈ 
act²: parent(A) = parent(B)}. If A ∈ act, let children(A) denote {B ∈ act: parent(B) = A}. If A, B ∈ 
act, let lca(A,B) denote the least common ancestor of A and B. If A ∈ act, let desc(A) (resp. anc(A)) be 
the set of descendants (resp. ancestors) of A. Let proper-desc(A) (resp. proper-anc(A)) be the set of 
proper descendants (resp. ancestors) of A. 

It might be convenient for the reader to think of this a priori configuration of all possible actions 
into a tree as a preassigned "naming scheme" for actions. That is, the "name" of any action is 
assumed to carry within it information which locates that action in this universal tree of actions. In 
any particular execution, only some of these possible actions will be "activated". The (virtual) action 
U, the parent of all top-level actions, has been added for the sake of uniformity. Its presence provides 
a simplification in many arguments. 
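
One concrete way to realize such a naming scheme, offered only as an illustrative assumption, is to name each action by the tuple of child indices on the path from the root, so that U is the empty tuple. The tree relations then become simple prefix operations, as the following Python sketch shows.

```python
from typing import Set, Tuple

Action = Tuple[int, ...]
U: Action = ()  # the (virtual) root action


def parent(a: Action) -> Action:
    """The parent of a; defined for every action except U."""
    if a == U:
        raise ValueError("U has no parent")
    return a[:-1]


def anc(a: Action) -> Set[Action]:
    """All ancestors of a, including a itself: exactly the prefixes of its name."""
    return {a[:k] for k in range(len(a) + 1)}


def proper_anc(a: Action) -> Set[Action]:
    return anc(a) - {a}


def is_desc(b: Action, a: Action) -> bool:
    """True when b is a descendant of a (possibly a itself), i.e. a is a prefix of b."""
    return b[:len(a)] == a


def lca(a: Action, b: Action) -> Action:
    """Least common ancestor: the longest common prefix of the two names."""
    k = 0
    while k < min(len(a), len(b)) and a[k] == b[k]:
        k += 1
    return a[:k]


def are_siblings(a: Action, b: Action) -> bool:
    """True when a and b have the same parent (neither may be U)."""
    return a != U and b != U and parent(a) == parent(b)
```

For example, the actions named (1, 2, 3) and (1, 4) have least common ancestor (1,), and the ancestors of (1, 2) are (), (1,) and (1, 2).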

We assume a priori determination of which actions actually access data, which objects they 
access and the functions they perform on those objects. Namely, let accesses denote the leaves of 
the tree described above. It is exactly these actions which access data. (We assume that U ∉ 
accesses, so that the entire set of actions is nontrivial.) Let object: accesses → obj be a fixed 
function. If object(A) = x, we say that A is an access to x. For A ∈ accesses, let update(A): 
values(object(A)) → values(object(A)) be a fixed function, describing the change made by A to its 
object. Let sameobject denote {(A,B) ∈ accesses²: object(A) = object(B)}. 

It might at first appear that our model does not permit updates to depend on previous steps 
executed by a transaction. This is not our intention. Dependence on previous steps is modelled by 
our choice of a particular access: the "name" of the access is assumed to carry information about 
previous steps executed by a transaction. 

Note that the usual read and write operations of serializability theory can be regarded as special 
cases of accesses. Namely, "read accesses" have the identity function as their associated update 
function, while "write accesses" have an associated update function which is a constant function. 
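
The view of reads and writes as update functions can be made concrete with a small sketch. The helper names `read`, `write` and `apply_updates` are hypothetical, and `apply_updates` simply composes update functions in order starting from a given value.

```python
from typing import Callable, Iterable, TypeVar

V = TypeVar("V")
Update = Callable[[V], V]


def read() -> Update:
    """A read access leaves the object unchanged: its update is the identity."""
    return lambda v: v


def write(c) -> Update:
    """A write access ignores the old value: its update is a constant function."""
    return lambda _v: c


def apply_updates(initial, updates: Iterable[Update]):
    """Apply a sequence of update functions, in order, starting from `initial`."""
    value = initial
    for u in updates:
        value = u(value)
    return value
```

A general access, such as an increment `lambda v: v + 1`, fits the same interface, so reads and writes appear as the two extreme cases: one preserves all of the old value and the other uses none of it.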

3.2. Action Trees 

Next, we give a way of describing a "snapshot" of a particular execution, using a structure called 
an "action tree". An action tree can be regarded as the generalization of the log from ordinary 
serializability theory. The information captured in an action tree includes which actions have been 
"activated", what the status of each such action is (i.e. active, committed or aborted), and what value 
of its data object was seen by each access. 

An action tree T has components vertices_T, active_T, committed_T, aborted_T, and label_T, where 

- vertices_T is a finite subset of act, closed under the parent operation: if A ∈ vertices_T - {U}, then 
parent(A) ∈ vertices_T. (These represent the actions which have ever been created during the current 
execution.) 

- active_T, committed_T and aborted_T comprise a partition of vertices_T. (These classifications 
indicate the current status of each action that has ever been created. When a non-access action is 
first created, it is classified as active. At some later time, its classification can be changed to either 
committed or aborted. By "committed", we mean that the action is committed relative to its parent, 
but not necessarily committed permanently. Permanent commit of an action would be represented by 
classification of all ancestors of the action, except for U, as committed. Section 3.4 contains 
definitions and a lemma about permanent commit of actions.) 

- label_T: datasteps_T → values(obj), (where datasteps_T = committed_T ∩ accesses), with label_T(A) 
∈ values(object(A)). (The label of an access to an object is intended to represent the value read by 
that access. Since the access has an associated function, the value which the access writes into the 
object is deducible from the value read, and therefore need not be explicitly represented. As a 
technical convenience, we do not assign a label to accesses until they become committed.) 

The following definitions are just convenient shorthand for concepts already defined. Let done_T 
denote committed_T ∪ aborted_T. Let status_T be defined by status_T(A) = 'active' (resp. 'committed', 
'aborted') provided A ∈ active_T (resp. committed_T, aborted_T). Let accesses_T = vertices_T ∩ accesses, 
accesses_T(x) = {B ∈ accesses_T: object(B) = x}, and datasteps_T(x) = {B ∈ datasteps_T: object(B) = 
x}. 
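
The components of an action tree translate directly into a small data structure. The sketch below is a minimal illustration under stated assumptions: actions are named by tuples of child indices with U as the empty tuple, the set of accesses is supplied by the caller, and the check that each label lies in values(object(A)) is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Action = Tuple[int, ...]
U: Action = ()  # root action (assumed naming scheme: tuples of child indices)


@dataclass
class ActionTree:
    active: Set[Action]
    committed: Set[Action]
    aborted: Set[Action]
    label: Dict[Action, object] = field(default_factory=dict)

    @property
    def vertices(self) -> Set[Action]:
        return self.active | self.committed | self.aborted

    @property
    def done(self) -> Set[Action]:
        return self.committed | self.aborted

    def status(self, a: Action) -> str:
        if a in self.active:
            return "active"
        if a in self.committed:
            return "committed"
        if a in self.aborted:
            return "aborted"
        raise KeyError(a)

    def datasteps(self, accesses: Set[Action]) -> Set[Action]:
        return self.committed & accesses

    def is_well_formed(self, accesses: Set[Action]) -> bool:
        v = self.vertices
        # The three status classes must be pairwise disjoint (a partition of vertices).
        disjoint = len(v) == len(self.active) + len(self.committed) + len(self.aborted)
        # vertices must be closed under the parent operation.
        closed = all(a == U or a[:-1] in v for a in v)
        # Labels are assigned exactly to the committed accesses.
        labelled = set(self.label) == self.datasteps(accesses)
        return disjoint and closed and labelled
```

For instance, a tree with U and (1,) active, access (1, 1) committed with label 0, and access (1, 2) aborted is well formed, whereas a tree containing (1,) but not U is not, since it is not closed under the parent operation.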

3.3. Visibility 

Next, we give a very important definition which helps to describe the "views" which actions have, 
of each other and of the data. In particular, this definition allows us to describe actions whose 
existence is intended to be known to other actions (i.e. not masked from those other actions by 
intervening failures or active actions). For A ∈ vertices_T, let visible_T(A) denote {B ∈ vertices_T: anc(B) 
∩ proper-desc(lca(A,B)) ⊆ committed_T}. That is, visible_T(A) is just the set of actions whose existence 
is potentially known to action A, because they and all their ancestors, up to and not including some 
ancestor of A, have committed (to their parents). Action A will be permitted to see the results of 
updates made by the transactions in visible_T(A), and no others. For A ∈ vertices_T, x ∈ obj, let 
visible_T(A,x) denote visible_T(A) ∩ datasteps_T(x). The following lemma describes elementary 
properties of "visibility". 

Lemma 5: Let T be an action tree, A, B, C ∈ vertices_T.

a. If B ∈ desc(A), then A ∈ visible_T(B).

b. A ∈ visible_T(B) if and only if A ∈ visible_T(lca(A,B)).

c. If A ∈ visible_T(B) and B ∈ visible_T(C), then A ∈ visible_T(C).

d. If A ∈ desc(B) and C ∈ visible_T(B), then C ∈ visible_T(A).

e. If A ∈ desc(B) and A ∈ visible_T(C), then B ∈ visible_T(C).

Proof:

a. Immediate.

b. Immediate from the fact that lca(A,B) = lca(A,lca(A,B)).

c. Let D ∈ anc(A) ∩ proper-desc(lca(A,C)). We must show that D ∈ committed_T. If D ∈
proper-desc(lca(A,B)), then the fact that A ∈ visible_T(B) implies the result. So assume that D ∉
proper-desc(lca(A,B)). It must be the case that D ∈ anc(lca(A,B)), and that lca(B,C) = lca(A,C).
Thus, D ∈ anc(B) ∩ proper-desc(lca(B,C)), so the fact that B ∈ visible_T(C) implies the result.

d. Immediate from parts a and c.

e. Immediate from parts a and c.



A related definition allows us to describe actions which are capable of "committing up to the top
level". If A ∈ vertices_T, then we say A is live in T provided anc(A) ∩ aborted_T = ∅, and we say A is
dead in T otherwise.

Lemma 6: If A, B ∈ vertices_T, A is live in T, and B ∈ visible_T(A), then B is live in T.
Proof: If B is dead in T, then there exists C ∈ anc(B) ∩ aborted_T. We know C ∉
proper-desc(lca(A,B)), since B ∈ visible_T(A). Thus, C ∈ anc(lca(A,B)) ⊆ anc(A), so A is
dead in T, a contradiction.

□
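The definitions of visibility and liveness can be made concrete with a small sketch. The encoding below is an assumption made for illustration, not part of the paper: the tree is a `parent` map over all actions (the root U maps to `None`), and the keys of a `status` map play the role of vertices_T. The function names are hypothetical.

```python
def anc(parent, a):
    """Return a and all of its ancestors, ordered from a up to the root."""
    chain = []
    while a is not None:
        chain.append(a)
        a = parent[a]
    return chain


def is_visible(parent, status, a, b):
    """True when b is in visible_T(a).

    Climb from b towards the root. Every vertex met strictly below
    lca(a, b) must be committed; reaching an ancestor of a ends the climb.
    """
    anc_a = set(anc(parent, a))
    for c in anc(parent, b):
        if c in anc_a:              # c is lca(a, b)
            return True
        if status[c] != "committed":
            return False
    return False                    # not reached when a and b share a root


def visible(parent, status, a):
    """visible_T(a) as a set; the keys of status stand for vertices_T."""
    return {b for b in status if is_visible(parent, status, a, b)}


def is_live(parent, status, a):
    """a is live when neither a nor any ancestor of a has aborted."""
    return all(status[c] != "aborted" for c in anc(parent, a))


# U has children A and B; A has child A1; B has child B1.
parent = {"U": None, "A": "U", "B": "U", "A1": "A", "B1": "B"}
status = {"U": "active", "A": "committed", "B": "active",
          "A1": "committed", "B1": "committed"}

print(sorted(visible(parent, status, "U")))    # B and B1 are masked by the active B
print(sorted(visible(parent, status, "B1")))   # B1 can see every vertex here
```

In this example B1 has committed only relative to its still-active parent B, so it is visible inside B's subtree but not to U or to A1.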

3.4. Serializability 

In this subsection, we develop the basic correctness condition for action trees: serializability. 

First, we define the result of applying a sequence of steps to a data object. If x ∈ obj and s is a
finite sequence of datasteps, then we define result(x,s) as follows. If s is the empty sequence, then
result(x,s) = init(x). Otherwise, let s = s'A. Then result(x,s) = update(A)(result(x,s')) if A involves x,
= result(x,s') otherwise.
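The recursive definition of result(x,s) unrolls into a left-to-right fold over the sequence. The following is a minimal sketch; the dictionaries `init`, `obj` and `update` are hypothetical stand-ins for init(x), object(A) and update(A).

```python
def result(x, steps, init, obj, update):
    """Value of object x after applying the datasteps in `steps`, in order."""
    value = init[x]                     # empty sequence: result(x, s) = init(x)
    for a in steps:                     # s = s'A, processed left to right
        if obj[a] == x:                 # A involves x: apply update(A)
            value = update[a](value)
        # otherwise result(x, s'A) = result(x, s')
    return value


init = {"x": 0, "y": 10}
obj = {"A": "x", "B": "y", "C": "x"}
update = {"A": lambda v: v + 1, "B": lambda v: v * 2, "C": lambda v: v * 3}

print(result("x", ["A", "B", "C"], init, obj, update))   # (0 + 1) * 3
print(result("y", ["A", "B", "C"], init, obj, update))   # 10 * 2
```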

If S is a set, and ≤ is a total order on the elements of S, then we let «S; ≤» denote the sequence
consisting of the elements of S, in the order given by ≤.

In order to define serializability, we need to consider linear orderings of all sets of siblings in the
action tree. Thus, let T be an action tree. A partial order p ⊆ siblings is linearizing for T provided p
totally orders all sets of siblings in T. A linearizing partial order p induces a total order, induced_{T,p}, on
datasteps_T, in the obvious way: if A and B are datasteps, with respective ancestors A' and B', where
A' and B' are siblings, then (A,B) ∈ induced_{T,p} if and only if (A',B') ∈ p. If A ∈ datasteps_T(x) and p is a
linearizing partial order for T, let preds_{T,p}(A) denote «{B ∈ visible_T(A,x) : (B,A) ∈ induced_{T,p} and B ≠
A}; induced_{T,p}». Thus, preds_{T,p}(A) denotes the sequence of datasteps whose effects on A's object
are supposed to be visible to A.

A linearizing partial order p for T is said to be a serializing partial order for T provided that
label_T(A) = result(x, preds_{T,p}(A)), for all A ∈ datasteps_T(x). That is, the value actually seen by A for its
data object is exactly the result of the datasteps whose effects are supposed to be visible to A. T is
said to be serializable provided there exists some serializing partial order for T.
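The check that a given linearizing order is serializing can be sketched as follows. This is an illustration under stated assumptions rather than the paper's construction: the sibling order p is encoded by a hypothetical `rank` map giving each non-root action a number that is distinct among its siblings, so that comparing root-to-leaf rank paths reproduces the induced order on datasteps.

```python
def anc(parent, a):
    """Return a and all of its ancestors, ordered from a up to the root."""
    chain = []
    while a is not None:
        chain.append(a)
        a = parent[a]
    return chain


def is_visible(parent, status, a, b):
    """True when b is in visible_T(a)."""
    anc_a = set(anc(parent, a))
    for c in anc(parent, b):
        if c in anc_a:              # reached lca(a, b)
            return True
        if status[c] != "committed":
            return False
    return False


def path_key(parent, rank, a):
    """Ranks from the child of the root down to a. Two datasteps first differ
    at their sibling ancestors, so tuple comparison gives the induced order."""
    return tuple(rank[c] for c in reversed(anc(parent, a)[:-1]))


def preds(parent, status, obj, rank, a):
    """Visible datasteps on a's object that precede a, in induced order."""
    key_a = path_key(parent, rank, a)
    earlier = [b for b in obj
               if b != a and obj[b] == obj[a]
               and status.get(b) == "committed"
               and is_visible(parent, status, a, b)
               and path_key(parent, rank, b) < key_a]
    return sorted(earlier, key=lambda b: path_key(parent, rank, b))


def is_serializing(parent, status, obj, init, update, label, rank):
    """True when every datastep's label equals result(x, preds(A))."""
    for a in obj:
        if status.get(a) != "committed":
            continue                # only committed accesses are datasteps
        value = init[obj[a]]
        for b in preds(parent, status, obj, rank, a):
            value = update[b](value)
        if label[a] != value:
            return False
    return True


# Two committed subtransactions, each with one access to x (initially 5).
parent = {"U": None, "T1": "U", "T2": "U", "a1": "T1", "a2": "T2"}
status = {"U": "active", "T1": "committed", "T2": "committed",
          "a1": "committed", "a2": "committed"}
obj = {"a1": "x", "a2": "x"}
init = {"x": 5}
update = {"a1": lambda v: v + 1, "a2": lambda v: v * 2}
label = {"a1": 5, "a2": 6}          # a1 read 5, a2 read 6

t1_first = {"T1": 0, "T2": 1, "a1": 0, "a2": 0}
t2_first = {"T1": 1, "T2": 0, "a1": 0, "a2": 0}
print(is_serializing(parent, status, obj, init, update, label, t1_first))
print(is_serializing(parent, status, obj, init, update, label, t2_first))
```

With these labels only the order that places T1 before T2 explains the values read, so the tree is serializable even though one of the two candidate orders fails.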

In this paper, we consider serializability of portions of an action tree rather than an entire action
tree. In particular, it might sometimes be useful to require serializability only for those actions whose
effects become "permanent", and not worry about those which get aborted.

Thus, given an action tree, T, a new action tree, perm(T), is defined as follows.

- vertices_perm(T) = visible_T(U). (Lemma 5e shows that perm(T) is a tree.)

- If A ∈ vertices_perm(T), then status_perm(T)(A) = status_T(A). (This status is always 'committed',
except for U.)

- If A ∈ datasteps_perm(T), then label_perm(T)(A) = label_T(A).

The following lemma shows the useful property that all the vertices in a permanent subtree are 
visible to each other. 

Lemma 7: If T is an action tree and A, B ∈ vertices_perm(T), then B ∈ visible_perm(T)(A).
Proof: Since B ∈ vertices_perm(T) = visible_T(U), Lemma 5d implies that B ∈ visible_T(A).
Then B ∈ visible_perm(T)(A), since the status of each vertex is the same in T and perm(T).

In this paper, we will use the correctness condition that any tree T created by our algorithm should
have perm(T) serializable. (It is worth noting that one of the reasons that actions might be aborted is
that a concurrency controller has discovered that allowing an action to proceed or commit will
corrupt serializability. Thus, there is no reason to expect complete action trees to be serializable,
and we focus on the permanent part of the trees only.)

3.5. Discussion 

Note that the style in which serializability is defined here constrains the implementation less than 
the type of definition used in "traditional" concurrency control theory. The earlier definitions regard 
the data as external to the concurrency control algorithm; the algorithm is to take requests for data 
accesses and translate them into actual accesses, observing appropriate rules. Generally, the 
accesses performed by the concurrency control algorithm simply obtain the latest version of the data 
object. A clue that the earlier definitions are too constraining is that they do not apply unchanged to 
algorithms such as Reed's, which use sophisticated management of versions of the data. The earlier 
definitions require extensions [KP, BG] to handle algorithms such as Reed's. These extensions still
regard the data as external to the concurrency control algorithm, and so the modified correctness
conditions contain explicit information about particular "versions" of the data objects. It seems,
however, that the appearance of serializability, in terms of the values seen by the accesses, is really
all that matters - it is possible that this appearance could be preserved by some algorithm which does 
not operate in terms of versions at all. 

The less constraining approach which is taken here is to regard the data as internal to the 
concurrency control algorithm, at least for the purpose of stating the basic correctness conditions. 
Thus, the definitions introduced in this paper are intended to be applicable to algorithms which use 
single versions of data objects, algorithms that use multiple versions of data objects, as well as to 
other implementations as yet unforeseen. 

4. An Algebra Based on Action Trees 

In this section, we begin to use the event-state algebra framework. We use the set of action trees
as the state set for an algebra, and define a set of standard events which we would like to allow to be
performed on action trees. We describe each event by defining the circumstances under which the
event is to be allowed to be performed (the "precondition"), and the resulting changes to be made in
the action tree (the "effect").

We will use this algebra as a specification of correct abstract system behavior, the first level in our 
correctness proof. Thus, we must ensure that the definition of this algebra includes the property that
all action trees it generates have their permanent subtrees serializable. One way of doing this would
be to include preservation of serializability explicitly in all the preconditions. It is a little simpler
notationally just to state the serializability condition as a global invariant, to be maintained by all
events; thus, we follow this latter option. In terms of the algebraic model, there is an implicit 
precondition on each event stating that the result of the event satisfies the global invariant. 

We now define a set of events on action trees. That is, we define an algebra 𝒜 = <A, σ, Π>, where
A is the set of action trees, σ is the trivial action tree with the single vertex U, with status 'active', and
Π contains the four kinds of events described in (a)-(d) below. We define the events as follows. First,
we let C denote the set of all action trees, T, for which perm(T) is serializable. (In particular, σ ∈
C.) We place an implicit precondition on each event, stating that the result of the event is in C. Within
this constraint, we define the domain by giving a precondition on action trees T, and use assignment
notation to describe the effect of the event on T.

In all events, we assume that A ∈ act - {U}.

(a) create_A

(a1) Precondition

(a11) A ∉ vertices_T.
(a12) parent(A) ∈ vertices_T - committed_T.

(a2) Effect

(a21) vertices_T ← vertices_T ∪ {A}.
(a22) status_T(A) ← 'active'.

(b) commit_A, A ∉ accesses

(b1) Precondition

(b11) A ∈ active_T.
(b12) children(A) ∩ vertices_T ⊆ done_T.

(b2) Effect

(b21) status_T(A) ← 'committed'.

(c) abort_A

(c1) Precondition

(c11) A ∈ active_T.

(c2) Effect

(c21) status_T(A) ← 'aborted'.

(d) perform_{A,u}, A ∈ accesses, x = object(A), u ∈ values(x)

(d1) Precondition

(d11) A ∈ active_T.

(d2) Effect

(d21) status_T(A) ← 'committed'.
(d22) label_T(A) ← u.

The meaning of the four events is as follows. The create_A event creates (or "activates") a new
action. It is required, of course, that A not be already in the tree. Its parent must be there, however,
and must not already be committed (since a committed parent is assumed to have all of its children
completed, and to depend on the completion of the particular set of children it had at the time of
commit). Note that we allow A to be created after its parent has aborted. This might be reasonable in
an implementation in which the two events occur at different nodes of a distributed system, for
example. The effect of creating A is to add A to the tree, with status 'active'.

The commit_A event commits an active non-access action. It requires that A be active, and all its
children be completed. The effect is to change the status to 'committed'. The abort_A event is similar,
but there is no requirement on the children - an active action can abort at any time.

Finally, the perform_{A,u} event actually performs a step on a data object. It requires that access A
be active, and changes its status to 'committed'. It also records (in our action tree analog to the
"log") the value u seen by the access. (It is unnecessary to record the value written, since that could
be inferred from the value seen.) Note that we do not specify how the value u is supposed to be
obtained by the perform event; it is permissible to record any value, as long as the serializability
condition is preserved.
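The four events and their explicit preconditions can be sketched as a small state machine. The class and method names are hypothetical, and the sketch deliberately leaves out the implicit precondition that the resulting tree lies in C (that perm(T) stays serializable); it only enforces (a11)-(d11).

```python
class ActionTree:
    """Level 1 state: statuses and labels over a fixed universe of actions."""

    def __init__(self, parent, accesses):
        self.parent = parent            # parent map over all of act; U maps to None
        self.accesses = set(accesses)   # the actions that are accesses
        self.status = {"U": "active"}   # keys of status stand for vertices_T
        self.label = {}

    @staticmethod
    def _require(condition, message):
        if not condition:
            raise ValueError(message)

    def create(self, a):
        self._require(a not in self.status, "(a11) A is already a vertex")
        p = self.parent[a]
        self._require(p in self.status and self.status[p] != "committed",
                      "(a12) parent is missing or already committed")
        self.status[a] = "active"       # (a21), (a22)

    def commit(self, a):
        self._require(a not in self.accesses, "commit applies to non-accesses")
        self._require(self.status.get(a) == "active", "(b11) A is not active")
        children = [c for c in self.status if self.parent[c] == a]
        self._require(all(self.status[c] != "active" for c in children),
                      "(b12) some child of A is still active")
        self.status[a] = "committed"    # (b21)

    def abort(self, a):
        self._require(self.status.get(a) == "active", "(c11) A is not active")
        self.status[a] = "aborted"      # (c21)

    def perform(self, a, u):
        self._require(a in self.accesses, "perform applies to accesses")
        self._require(self.status.get(a) == "active", "(d11) A is not active")
        self.status[a] = "committed"    # (d21)
        self.label[a] = u               # (d22)


tree = ActionTree({"U": None, "T1": "U", "a1": "T1", "a2": "T1"}, {"a1", "a2"})
tree.create("T1")
tree.create("a1")
tree.perform("a1", 5)
tree.commit("T1")
print(tree.status)
```

After `commit("T1")`, a later `create("a2")` is rejected by (a12), matching the remark that a committed parent depends on exactly the children it had when it committed.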

We note that the only events which could cause the serializability constraint to be violated are 
commit and perform events. Thus, these are the only events for which the implicit precondition C is 
actually necessary. 

We also note that this algebra provides considerable flexibility in allowable sequences of events. 

5. Augmented Action Trees 

Now, we proceed to the second level of our proof. As before, it will be useful to define a data 
structure first, and then develop an algebra based on that data structure. The data structure to be 
used in the second level is called an "augmented action tree". It is very similar to an action tree, but 
includes some extra information describing a sequence of versions for each data object. An
augmented action tree is similar to a transaction conflict graph with resolution of conflicts. We stated
earlier that we did not want to rely on definitions that depend on data versions, for our basic 
correctness conditions. However, the definitions which make specific reference to versions are still 
useful in conjunction with the approach of this paper. Their role is in supplying sufficient conditions 
for serializability, and thereby helping to organize correctness proofs. 

Serializability is defined for augmented action trees. It is seen that serializability for augmented 
action trees implies serializability for corresponding action trees. Moreover, serializability for 
augmented action trees has a cycle-free characterization similar to those in usual concurrency 
control theory. Therefore, this structure can be useful in proofs of serializability for action trees.

Thus, it is at our second level that the interesting concurrency control arguments occur. 

5.1. Augmented Action Tree Definitions

An augmented action tree (AAT), T, is a pair (S, data_T), where S is an action tree and data_T ⊆
sameobject_S is a partial order on datasteps_S which totally orders the datasteps for each object. We
extend action tree notation to T; for example, we write datasteps_T to denote datasteps_S. We also
extend the definitions of visible, live, dead, linearizing, induced, preds and serializable to T, by
applying them to S.

The assumed ordering on accesses to each data object imposes an ordering on siblings higher up
in the tree. If T is an AAT, then let sibling-data_T denote {(A,B) ∈ siblings : (C,D) ∈ data_T for some C ∈
desc(A), D ∈ desc(B)}.

We require notation for an access's visible predecessors in the version order. If A ∈ datasteps_T(x),
then let v-data_T(A) denote {B ∈ visible_T(A,x) : (B,A) ∈ data_T and B ≠ A}. The following is a technical
lemma.

Lemma 8: Let T be an AAT. Let p be a linearizing partial order for T, x ∈ obj, and A ∈
datasteps_T(x). Assume that induced_{T,p} is consistent with data_T. Then preds_{T,p}(A) =
«v-data_T(A); data_T».

Proof: Straightforward. 



An AAT, T, is data-serializable provided there exists p, a serializing partial order for T, with the
additional property that induced_{T,p} is consistent with data_T. Thus, T is data-serializable provided that
it is serializable in a way that respects the conflict resolution partial ordering. Of course,
data-serializability for AATs provides a sufficient condition for serializability.

5.2. Characterization of Data-Serializability

The analog of the usual characterization in concurrency control theory is proved in this
subsection. Namely, we give a characterization of data-serializability in terms of absence of cycles.

First, we give a definition which says that the label of each access describes the correct object
value which the access should see, if the versions of objects are ordered according to the data_T
order. Formally, an AAT is version-compatible provided for every x ∈ obj, and every A ∈
datasteps_T(x), it is the case that label_T(A) = result(x,s), where s = «v-data_T(A); data_T».

The next theorem contains the characterization result.

Theorem 9: An AAT, T, is data-serializable if and only if both of the following are true:

a. T is version-compatible.

b. There are no cycles of length greater than one in sibling-data_T.

Proof: Assume T is data-serializable, and obtain p, a serializing partial order for T for
which induced_{T,p} is consistent with data_T.

a. Let A ∈ datasteps_T(x), s = «v-data_T(A); data_T». Then label_T(A) =
result(x, preds_{T,p}(A)), by the definition of serializability, = result(x,s), by
Lemma 8.

b. sibling-data_T ⊆ p. Thus, there are no cycles of length greater than one in
sibling-data_T.

Now assume a. and b. Let p be any partial order which totally orders all siblings and is
consistent with sibling-data_T. Then p is linearizing for T, and induced_{T,p} is consistent with
data_T. We will show that p is a serializing partial order for T. Let x ∈ obj, A ∈ datasteps_T(x).
We must show that label_T(A) = result(x, preds_{T,p}(A)). Since T is version-compatible, we
know that label_T(A) = result(x,s), where s = «v-data_T(A); data_T». Then Lemma 8 implies
that s = preds_{T,p}(A), as needed.

□
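Condition b of Theorem 9 is a plain graph test, which the following sketch illustrates. The encoding is an assumption: data_T is given as a set of pairs of datasteps with the reflexive pairs left out, and only edges between distinct siblings are produced, since those are the ones that can form a cycle of length greater than one.

```python
def anc(parent, a):
    """Return a and all of its ancestors, ordered from a up to the root."""
    chain = []
    while a is not None:
        chain.append(a)
        a = parent[a]
    return chain


def sibling_edge(parent, c, d):
    """The pair of distinct sibling ancestors of c and d, or None."""
    up_c, up_d = anc(parent, c), anc(parent, d)
    while up_c and up_d and up_c[-1] == up_d[-1]:
        up_c.pop()                  # strip the common ancestors, root first
        up_d.pop()
    if not up_c or not up_d:
        return None                 # c == d, or one is an ancestor of the other
    return (up_c[-1], up_d[-1])     # the children of lca(c, d)


def sibling_data(parent, data):
    """Lift each pair of the data order to the sibling level."""
    edges = set()
    for c, d in data:
        edge = sibling_edge(parent, c, d)
        if edge is not None:
            edges.add(edge)
    return edges


def has_nontrivial_cycle(edges):
    """Repeatedly delete vertices with no incoming edge; a cycle remains
    exactly when some vertices can never be deleted."""
    edges = set(edges)
    nodes = {n for e in edges for n in e}
    while nodes:
        sources = {n for n in nodes if not any(b == n for (_, b) in edges)}
        if not sources:
            return True
        nodes -= sources
        edges = {(a, b) for (a, b) in edges if a not in sources}
    return False


# T1 and T2 each access x (a1, a2) and y (b1, b2).
parent = {"U": None, "T1": "U", "T2": "U",
          "a1": "T1", "b1": "T1", "a2": "T2", "b2": "T2"}
agree = {("a1", "a2"), ("b1", "b2")}       # T1 precedes T2 on both objects
conflict = {("a1", "a2"), ("b2", "b1")}    # the two objects disagree

print(has_nontrivial_cycle(sibling_data(parent, agree)))
print(has_nontrivial_cycle(sibling_data(parent, conflict)))
```

The second data order puts T1 before T2 on x and T2 before T1 on y, which is the classic conflict cycle that rules out data-serializability.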

6. An Algebra Based on Augmented Action Trees

In this section, we define the algebra for our second level. This algebra will be based on the set of 
AAT's. We define events on AAT's, analogously to the definitions for action trees. Once again, we 
carry out the definitions within the event-state algebra framework. We then prove several basic 
properties of this algebra. Finally, we show that this algebra simulates the level 1 algebra. 

The second-level algebra can be understood as describing the "abstract effect" achieved by 
locking algorithms. (We do not actually describe a locking mechanism until later levels.) The major 
accomplishment of this section involves showing that this abstract effect in fact guarantees the
required serializability condition. The argument is relatively nontrivial, and is analogous to the usual
correctness proofs for strict two-phase locking. Arguments for later levels will show that locking
protocols actually achieve the required abstract effect. Thus, we have factored the correctness proof 
for a locking algorithm into two natural parts. 

6.1. Definitions 

We define a new algebra 𝒜' = <A', σ', Π'>, where A' is the set of AAT's, σ' is the trivial AAT which
has a single vertex U with status 'active', and the events in Π' correspond closely to the events of 𝒜,
and are designated by the same names. (We will rely on context to distinguish the two cases.) The
only differences are that there is no global constraint corresponding to C, and perform_{A,u} introduces
two additional preconditions and an additional change. These new conditions can be thought of as
capturing the abstract effect of a variant of Moss' locking algorithm.

(d1) Precondition

(d12) Let B ∈ datasteps_T(x), B live in T. Then B ∈ visible_T(A,x).

(d13) If A is live in T, then u = result(x,s), where s = «visible_T(A,x); data_T».

(d2) Effect

(d23) data_T ← data_T ∪ {(B,A) : B ∈ datasteps_T(x)} ∪ {(A,A)}.

The new preconditions say that a data access A must wait long enough so that all live accesses to 
the object have been committed, up to the level which matters to A. Also, the value used in the access 
is just the one resulting from the sequence of previous accesses, in the given data ordering. The new 
effect just involves adding appropriate new pairs to the end of the data ordering. 
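A sketch of the second-level perform event may help to show how (d12) and (d13) interact. Several choices here are assumptions for illustration: the state is a plain dictionary, and because (d23) always appends A after every existing datastep on x, the restriction of data_T to each object is kept as a list in `state["order"]`.

```python
def anc(parent, a):
    """Return a and all of its ancestors, ordered from a up to the root."""
    chain = []
    while a is not None:
        chain.append(a)
        a = parent[a]
    return chain


def is_visible(parent, status, a, b):
    """True when b is in visible_T(a)."""
    anc_a = set(anc(parent, a))
    for c in anc(parent, b):
        if c in anc_a:
            return True
        if status[c] != "committed":
            return False
    return False


def is_live(parent, status, a):
    return all(status[c] != "aborted" for c in anc(parent, a))


def perform(state, a, u):
    """perform_{A,u} on an AAT, raising ValueError if a precondition fails."""
    parent, status = state["parent"], state["status"]
    x = state["obj"][a]
    steps = state["order"][x]           # datasteps on x, in data_T order
    if status.get(a) != "active":
        raise ValueError("(d11) A is not active")
    for b in steps:                     # (d12): live datasteps must be visible
        if is_live(parent, status, b) and not is_visible(parent, status, a, b):
            raise ValueError("(d12) a live datastep is not yet visible to A")
    if is_live(parent, status, a):      # (d13): u is the value A should see
        value = state["init"][x]
        for b in steps:
            if is_visible(parent, status, a, b):
                value = state["update"][b](value)
        if u != value:
            raise ValueError("(d13) u is not the result of the visible steps")
    status[a] = "committed"             # (d21)
    state["label"][a] = u               # (d22)
    steps.append(a)                     # (d23): A follows all earlier datasteps


state = {
    "parent": {"U": None, "T1": "U", "T2": "U", "a1": "T1", "a2": "T2"},
    "status": {"U": "active", "T1": "active", "T2": "active",
               "a1": "active", "a2": "active"},
    "obj": {"a1": "x", "a2": "x"},
    "init": {"x": 5},
    "update": {"a1": lambda v: v + 1, "a2": lambda v: v * 2},
    "label": {},
    "order": {"x": []},
}

perform(state, "a1", 5)                 # a1 reads the initial value
state["status"]["T1"] = "committed"     # stands in for a commit_T1 event
perform(state, "a2", 6)                 # a2 must now read a1's result
print(state["order"]["x"], state["label"])
```

Attempting `perform(state, "a2", ...)` while T1 is still active fails (d12), which is the abstract counterpart of a2 waiting for T1's lock on x.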

6.2. Preliminary Results

This section contains two straightforward lemmas. The first describes some invariants preserved
by the events.

Lemma 10: If T is computable in 𝒜', then the following are true.

a. If A ∈ vertices_T and parent(A) ∈ committed_T, then A ∈ done_T.

b. U ∈ active_T.

c. If (B,A) ∈ data_T, then either B is dead in T, or else B ∈ visible_T(A).

d. If A ∈ committed_T and B ∈ desc(A) ∩ vertices_T, then either B is dead in T or
else B ∈ visible_T(A).

Proof: Most of the arguments are straightforward. We argue cases c. and d.

c. If B = A, the result is immediate. If B ≠ A, then the only way we get (B,A) ∈ data_T is
by virtue of some perform_{A,u} event. That is, there exists T' such that T' ⊢ T, such that the
precondition for some step perform_{A,u} is satisfied in T'. Thus, B is dead in T' or B ∈
visible_T'(A). Therefore, B is dead in T or B ∈ visible_T(A).

d. If B = A, the result is immediate. So assume A ≠ B. Let A ∈ committed_T, B ∈
desc(A) ∩ vertices_T, B live in T, and B ∉ visible_T(A). Then there exist C, D ∈ desc(A) ∩
anc(B), for which C = parent(D), C ∈ committed_T and D ∈ active_T. But this contradicts
part a.

□



The second lemma of this subsection describes properties that hold of a pair of AAT's, one of 
which is derivable from the other. 

Lemma 11: Let T and T' be computable in 𝒜', and assume that T ⊢ T'.

a. vertices_T ⊆ vertices_T', committed_T ⊆ committed_T', aborted_T ⊆ aborted_T', and
data_T ⊆ data_T'.

b. If A ∈ datasteps_T then label_T(A) = label_T'(A).

c. If A ∈ datasteps_T and (B,A) ∈ data_T', then (B,A) ∈ data_T.

d. If A ∈ vertices_T, then visible_T(A) ⊆ visible_T'(A).

e. If A ∈ vertices_T and A is live in T', then A is live in T.

f. If A = parent(B) and A ∈ committed_T and B ∈ vertices_T', then B ∈ done_T.

Proof: The only case that takes some arguing is f. Let A = parent(B), A ∈ committed_T
and B ∈ vertices_T'. Let T' be the result of Φ applied to T, and let T be the result of Ψ. Then
Ψ contains a step π of the form commit_A, and ΨΦ contains a step ρ of the form create_B.
π cannot precede ρ, since the precondition for ρ would be violated. So ρ precedes π.
Then the precondition for π implies that B ∈ done_T.



6.3. Computability Guarantees Data-Serializability 

Note that there is no correctness condition for AAT's explicitly mentioning serializability. This is
because for AAT's, computability alone is sufficient to guarantee serializability of perm(T), as we
show in the next theorem. It is convenient to prove the two required properties separately, in two
lemmas. The second of these two lemmas is the hardest result in the paper.

Lemma 12: If T is computable in 𝒜', then perm(T) is version-compatible.
Proof: Let A ∈ datasteps_perm(T)(x). We must show that u (= label_perm(T)(A)) =
result(x,s), where s = «v-data_perm(T)(A); data_perm(T)». A is inserted into the tree by a
perform_{A,u} step π, so let the event sequence producing T be written as ΦπΨ. Let T'
denote the result of Φ, and T'' the result of Φπ. The preconditions for π show that
label_T''(A) = result(x,s'), where s' = «visible_T'(A,x); data_T'». By Lemma 11b and the
definition of perm(T), it follows that label_perm(T)(A) = result(x,s'). Thus, it suffices to show
that s = s'. Since both data_T' and data_perm(T) are consistent with data_T, it suffices to show
that s and s' contain the same elements.

First, let B ∈ s. Then (B,A) ∈ data_T, and so by Lemma 11c, B ∈ datasteps_T''(x). Since A
is the only element in T'' which is not in T', B ∈ datasteps_T'(x). Since A ∈ vertices_perm(T) =
visible_T(U), and U ∉ aborted_T (by Lemma 10), it follows that A is live in T. Since B ∈
visible_T(A), Lemma 6 shows that B is live in T. Thus, B is live in T', by Lemma 11e. The
precondition for π implies that B ∈ visible_T'(A,x), so B ∈ s'.

Conversely, suppose B ∈ s'. Then B ≠ A since A ∉ vertices_T'. Then (B,A) ∈ data_T'', so
by Lemma 11a, (B,A) ∈ data_T. By Lemma 11d, B ∈ visible_T(A,x). By Lemma 7, it suffices to
show that B ∈ vertices_perm(T) = visible_T(U). But B ∈ visible_T(A) and A ∈ visible_T(U), so
Lemma 5c suffices.

□

Lemma 13: If T is computable in 𝒜', then there are no nontrivial cycles in
sibling-data_perm(T).

Proof: Assume the contrary: let (A = A_0, A_1, ..., A_k = A), k ≥ 2, be a minimum length
cycle such that (A_i, A_{i+1}) ∈ sibling-data_perm(T) for all i, 0 ≤ i ≤ k-1. Let a sequence Φ of
events be defined so that T is the result of Φ. We will show that for each i, 0 ≤ i ≤ k-1,
there exists a prefix Φ_i of Φ such that if T' is the result of Φ_i, then A_i ∈ done_T', and A_{i+1} ∉
done_T'. If we fix i for which Φ_i is of maximum length, and let T' be the result of this Φ_i, then
we see that A_{i+1} ∉ done_T'. But Φ_{i+1} is no longer than Φ_i, so Lemma 11a implies that A_{i+1}
∈ done_T', which is a contradiction.

So fix i, 0 ≤ i ≤ k-1. Then (A_i, A_{i+1}) ∈ sibling-data_perm(T). Then there exist B ∈
desc(A_i), C ∈ desc(A_{i+1}) with (B,C) ∈ data_perm(T). Since B, C ∈ vertices_perm(T), it follows
that (anc(B) ∪ anc(C)) ∩ proper-desc(U) ⊆ committed_T. Now, Φ has a prefix Ψπ, where π
is a perform_{C,u} step. Let T' be the result of Ψ, and T'' the result of Ψπ. Lemma 11c
implies that (B,C) ∈ data_T'', so that B ∈ datasteps_T'. Since B is live in T (using Lemma
10b), Lemma 11e implies that B is live in T'. Then the precondition for π implies that B ∈
visible_T'(C), which means that A_i ∈ anc(B) ∩ proper-desc(lca(B,C)) ⊆ committed_T' ⊆
done_T'. We must show that A_{i+1} ∉ done_T'; if we can do this, then taking Φ_i = Ψ yields the
result. Assume A_{i+1} ∈ done_T'. Then let D be the lowest ancestor of C for which D ∈
done_T'; it must be the case that D ∈ anc(C) ∩ proper-desc(lca(B,C)) ⊆ committed_T, so D ∈
committed_T'. Since C ∈ active_T', we know that D ≠ C. Let E be the single element of
children(D) ∩ anc(C). Then E ∉ done_T'. Then E ∉ vertices_T, by Lemma 11f. This means C
∉ vertices_T. This is a contradiction.

□

Theorem 14: If T is computable in 𝒜', then perm(T) is data-serializable.

Proof: Immediate from Lemma 12, Lemma 13 and Theorem 9.

6.4. Simulation

Next, we show that 𝒜' simulates 𝒜. We define a mapping h from 𝒜' to 𝒜 as follows. If T =
(S, data_T) is an AAT, then h(T) = {S}. If π is in Π', then h(π) is just the event in Π with the same name.

Lemma 15: h is a simulation of 𝒜 by 𝒜'.

Proof: (a) and (d) of the definition of a possibilities mapping are immediate. Property
(b) follows immediately from the fact that a' ∈ domain(π') (since only additional constraints
are added for 𝒜'); note that Theorem 14 implies that the C-constraint is always satisfied.
Property (c) is then straightforward. Thus, h is a possibilities mapping. Lemma 3 shows
that h is a simulation.

□

7. An Algebra Based on Version Maps

In order to complete the proof of Moss' algorithm, it remains to prove that it achieves the abstract
effect of locking described by 𝒜'. It seems simplest to decompose this task further, first showing that
a centralized locking algorithm simulates 𝒜', and then showing that a distributed version of the
algorithm simulates the centralized version. It turns out to be feasible to decompose the proof of the
centralized locking algorithm still further. Namely, we first describe a locking-style algorithm which
retains a large amount of useful information. Then we show that a more optimized locking algorithm
simulates the algorithm which retains information.

In this section, we develop the third level of the algorithm, the locking-style algorithm which
retains information.

7.1. Version Maps 

As before, we begin by introducing another data structure, called a "version map". This one 
records some locking information for each object. As in Moss' algorithm, each object has a stack of 
locks, held at any time by a sequence of actions which are successive descendants. The version map 
records, for each object, and each action in some sequence of successive descendants, the 
sequence of accesses to the object whose result is available to the action. 

Thus, a version map is a partial mapping V from obj x act to sequences of accesses, such that the 
following properties are satisfied: 

- V(x,U) is defined for all x, 

- each V(x,A) consists of accesses to x, 

- for each x, if V(x,A) and V(x,B) are both defined, then either A € desc(B) or B € desc(A), 

- if V(x,A) and V(x,B) are both defined and B ∈ desc(A), then V(x,B) is an extension of V(x,A).

Thus, for each x, V is defined only for transactions which lie on some chain of ancestors; V is not 
necessarily defined for all transactions on the chain, but only for some subset of the transactions on 
the chain. 

If A is the least action for which V(x,A) is defined, then we call A the principal action for x in V; in
this case, if result(x,V(x,A)) = u, we say that u is the principal value of x in V.

7.2. Definition of the Algebra

We define another algebra, 𝒜'' = <A'', σ'', Π''>, as follows. A'' is the set of pairs (T,V), where T is
an AAT and V is a version map. σ'' consists of the trivial AAT consisting of a single node U with status
'active', and the version map which has V(x,U) equal to the empty sequence, for all x, and is otherwise
undefined. Π'' consists of the six events defined below in (a)-(f).

In all the events to follow, we assume that A ∈ act - {U}. Events (a)-(c) are identical to (a)-(c) of
𝒜'. Some changes are needed in the perform event, and there are two new events which manipulate
locks.

(d) perform_{A,u}, A ∈ accesses, x = object(A), u ∈ values(x)

(d1) Precondition

(d11) A ∈ active_T.

(d12) {B : V(x,B) is defined} ⊆ proper-anc(A).

(d13) u is the principal value of x in V.

(d2) Effect

(d21) status_T(A) ← 'committed'.

(d22) label_T(A) ← u.

(d23) data_T ← data_T ∪ {(B,A) : B ∈ accesses_T(x)} ∪ {(A,A)}.

(d24) V(x,A) ← V(x,B) ∘ (A), where B is the principal action for x in V.

(e) release-lock_{A,x}, x ∈ obj

(e1) Precondition

(e11) V(x,A) is defined.
(e12) A ∈ committed_T.

(e2) Effect

(e21) V(x,parent(A)) ← V(x,A).
(e22) V(x,A) ← undefined.

(f) lose-lock_{A,x}, x ∈ obj

(f1) Precondition

(f11) V(x,A) is defined.
(f12) A is dead in T.

(f2) Effect

(f21) V(x,A) ← undefined.

Thus, (d) says that a perform_{A,u} event can only be carried out when the current lock-holders are
all proper ancestors of A, and when u is the proper value which should be provided to A. This event
has the new effect of augmenting the version map by giving a "lock" to A: A gets a sequence of
versions which is exactly that held by the previous principal action, concatenated with a new version
for A. Event (e) allows a lock to be released by a committed action; its effect is to pass the lock up to
its parent, so that its parent now obtains the sequence of versions previously held by the child. Event
(f) allows a lock to be released by a dead action.
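The lock-passing behaviour of the version map can be illustrated with a short sketch. The representation is an assumption: V is a dictionary from each object to a dictionary from lock-holding actions to their version sequences, the principal action is taken to be the deepest holder on the chain, and the data_T bookkeeping of (d23) is omitted to keep the example small.

```python
def anc(parent, a):
    """Return a and all of its ancestors, ordered from a up to the root."""
    chain = []
    while a is not None:
        chain.append(a)
        a = parent[a]
    return chain


def is_live(parent, status, a):
    return all(status[c] != "aborted" for c in anc(parent, a))


def principal(state, x):
    """Deepest action for which V(x, .) is defined (holders lie on one chain)."""
    return max(state["V"][x], key=lambda b: len(anc(state["parent"], b)))


def principal_value(state, x):
    """result(x, V(x, B)) for the principal action B."""
    value = state["init"][x]
    for c in state["V"][x][principal(state, x)]:
        value = state["update"][c](value)
    return value


def perform(state, a, u):
    x = state["obj"][a]
    holders = state["V"][x]
    if state["status"].get(a) != "active":
        raise ValueError("(d11) A is not active")
    if not set(holders) <= set(anc(state["parent"], a)[1:]):
        raise ValueError("(d12) a lock-holder is not a proper ancestor of A")
    if u != principal_value(state, x):
        raise ValueError("(d13) u is not the principal value of x")
    b = principal(state, x)
    state["status"][a] = "committed"            # (d21)
    state["label"][a] = u                       # (d22)
    holders[a] = holders[b] + [a]               # (d24)


def release_lock(state, a, x):
    holders = state["V"][x]
    if a == "U" or a not in holders:
        raise ValueError("(e11) V(x, A) is not defined for a non-root A")
    if state["status"].get(a) != "committed":
        raise ValueError("(e12) A is not committed")
    holders[state["parent"][a]] = holders.pop(a)    # (e21), (e22)


def lose_lock(state, a, x):
    holders = state["V"][x]
    if a == "U" or a not in holders:
        raise ValueError("(f11) V(x, A) is not defined for a non-root A")
    if is_live(state["parent"], state["status"], a):
        raise ValueError("(f12) A is not dead")
    del holders[a]                                  # (f21)


state = {
    "parent": {"U": None, "T1": "U", "T2": "U", "a1": "T1", "a2": "T2"},
    "status": {"U": "active", "T1": "active", "T2": "active",
               "a1": "active", "a2": "active"},
    "obj": {"a1": "x", "a2": "x"},
    "init": {"x": 5},
    "update": {"a1": lambda v: v + 1, "a2": lambda v: v * 2},
    "label": {},
    "V": {"x": {"U": []}},
}

perform(state, "a1", 5)                 # a1 takes the lock on x
release_lock(state, "a1", "x")          # the lock passes up to T1
state["status"]["T1"] = "committed"     # stands in for a commit_T1 event
release_lock(state, "T1", "x")          # the lock passes up to U
perform(state, "a2", 6)                 # a2 now sees a1's update
print(state["V"]["x"])
```

Until both releases have happened, a holder of the lock on x lies outside a2's ancestor chain, so `perform(state, "a2", ...)` is refused by (d12).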

7.3. Basic Properties 

In this subsection, we present a simple lemma stating some important invariants preserved in 𝒜''.

Lemma 16: If (T,V) is computable in 𝒜'', then the following are true.

a. If V(x,A) is defined, then A ∈ vertices_T.

b. If B ∈ datasteps_T(x) and B is live in T, then there exists A ∈ anc(B) with V(x,A)
defined and B an element of V(x,A).

c. If V(x,A) is defined, then each element of V(x,A) is in visible_T(A).

d. If V(x,A) is defined, then the elements of V(x,A) are in data_T order.

Proof: Straightforward. We argue b., for example. Immediately after an event
perform_{B,u} occurs, we see that V(x,B) is defined, and B ∈ V(x,B). Assume inductively that
there is some ancestor, C, of B with V(x,C) defined and B ∈ V(x,C). Since B remains live,
there are no steps of the form lose-lock_{C,x}. Thus, if V(x,C) is ever changed, it must be
because of a release-lock step. There are two possibilities. First, the change could occur
because of a release-lock_{C,x} step. But such a step causes V(x,parent(C)) to take on the
old value of V(x,C), thereby preserving the needed property. Second, the change could
occur because V(x,C) gets redefined to be the previous value of V(x,D), where D ∈
children(C). But because the successive sequences are extensions of each other, B is an
element of V(x,D) as well. Thus, the needed property is preserved in this case also.

□

7.4. Simulation 

Define a mapping h' from JL" to -4. 1 as follows, h' maps (T,V) to {T}, and maps events (a)-(d) to 
events of the same name, and events (e) and (f) to A. 
Lemma 1 7: h' is a simulation of J.' by X". 

Proof: It suffices to show that h' is a possibilities mapping. Properties (a) and (d) are
easy to check. We consider property (b). Let π' ∈ Π'', where h'(π') = π ∈ Π'. Then π' is
either of the form create_A, commit_A, abort_A or perform_{A,u}. In the first three cases,
property (b) is easy to check. So assume that π' is of the form perform_{A,u}. Assume (T,V)
is computable in 𝒜'' and π' is defined on (T,V), yielding (T',V'). We must show that
perform_{A,u} (i.e. the event of 𝒜') is defined on T. Let x = object(A).

Condition (d11) for 𝒜' follows immediately from the corresponding condition for 𝒜''.
We consider (d12). Let B ∈ datasteps_T(x), and assume that B is live in T. Since (T,V) is
computable in 𝒜'', Lemma 16 implies that there is some C ∈ anc(B) for which V(x,C) is
defined and for which B is an element of V(x,C). Then Lemma 16 implies that B ∈
visible_T(C). Since π' is defined on (T,V), (d12) for 𝒜'' implies that C ∈ anc(A). Since A ∈
vertices_T, Lemma 5 implies that B ∈ visible_T(A), as needed.

Next, we consider (d13). Assume A is live in T, and let s = <visible_T(A,x); data_T>. We
must show that u = result(x,s). Let B be the principal action for x in V. Condition (d13) for
𝒜'' implies that u = result(x,V(x,B)). It suffices to show that s and V(x,B) are identical.
Since the elements of V(x,B) are in data_T order (by Lemma 16), it suffices to show that s
and V(x,B) contain the same set of elements.

First assume C is in s, i.e. C ∈ visible_T(A,x). Since A is live in T, Lemma 6 implies that C
is live in T. Then Lemma 16 implies that there exists D ∈ anc(C) for which V(x,D) is defined
and C is an element of V(x,D). Since B is the principal action for x in V, the sequence
extension property of the definition of version maps implies that C is also an element of
V(x,B).

Conversely, assume that C is an element of V(x,B). Lemma 16 implies that C ∈
visible_T(B). Condition (d12) for 𝒜'' implies that B ∈ anc(A). Thus, C ∈ visible_T(A).

It is easy to check that property (c) holds, once we know that the definability conditions 
correspond. Therefore, h' is a possibilities mapping. 

□ 

Theorem 18: h ∘ h' is a simulation of 𝒜 by 𝒜''.

Proof: Immediate from Lemmas 15, 17 and 1. 

□ 

8. An Algebra Based on Value Maps 

The previous section described a version of a locking algorithm in which considerable information
(the sequences of versions) was retained. In this section, we describe the fourth level of our
algorithm. In this level, we optimize the locking algorithm of the previous level by condensing some of 
the information retained. Namely, it turns out not to be necessary to retain the complete sequences of 
versions; rather, we can manage by retaining only the latest value of the object for each action. 

Note that we can prove a simulation result after eliminating information precisely because
possibilities maps are able to yield sets of states rather than single states. The sets of states serve to
replace the eliminated information.

8.1. Value Maps 

As before, we introduce another data structure. This one records, for each object and action, the 
latest value of the object which is available to the action. 

A value map is a partial mapping V from obj × act to values(obj), such that the following properties
are satisfied:

- V(x,U) is defined for all x,

- each V(x,A) ∈ values(x), and

- for each x, if V(x,A) and V(x,B) are both defined, then either A ∈ desc(B) or B ∈ desc(A).

If A is the least action for which V(x,A) is defined (that is, A is a descendant of every other
action B for which V(x,B) is defined), then we call A the principal action for x in V; in
this case, if V(x,A) = u, we call u the principal value of x in V.

If V is a version map, then let eval(V) be the value map defined on exactly the same domain, so 
that eval(V)(x,A) = result(x,V(x,A)). 

Lemma 19: Let V be a version map, x ∈ obj. Then the principal action for x in V is the
same as the principal action for x in eval(V), and the principal value of x in V is the same as 
the principal value of x in eval(V). 
Proof: Straightforward. 

□ 
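
As an illustration only, the definitions of this subsection can be sketched in Python. The encoding is an assumption of the sketch, not part of the formal development: an action is the tuple of child indices on its path from the root U = (), a value map is a dictionary keyed by (object, action), and result(x,s) is taken to apply update(A) for each A in s in order, starting from init(x). The names is_desc, is_value_map, principal_action, principal_value and eval_map are hypothetical. Membership of each V(x,A) in values(x) is not checked.

```python
from functools import reduce

U = ()  # root action; assumed encoding: an action is a tuple of child indices


def is_desc(a, b):
    """True if action a is a descendant of b (each action is its own descendant)."""
    return a[:len(b)] == b


def is_value_map(v, objects):
    """Check that v(x,U) is defined for all x and that holders of x form a chain."""
    if any((x, U) not in v for x in objects):
        return False
    for x in objects:
        holders = [a for (y, a) in v if y == x]
        if any(not (is_desc(a, b) or is_desc(b, a)) for a in holders for b in holders):
            return False
    return True


def principal_action(v, x):
    """Least action holding an entry for x; on a chain this is the deepest one."""
    return max((a for (y, a) in v if y == x), key=len)


def principal_value(v, x):
    return v[(x, principal_action(v, x))]


def result(init, update, x, steps):
    """Assumed meaning of result(x,s): fold update(A) over s, starting at init(x)."""
    return reduce(lambda val, a: update[a](val), steps, init[x])


def eval_map(w, init, update):
    """eval(W): the value map on the same domain as the version map w."""
    return {(x, a): result(init, update, x, seq) for (x, a), seq in w.items()}


# A version map in which action (0,) holds the one-step sequence ((0,)) for x.
init = {"x": 0}
update = {(0,): lambda n: n + 1}
w = {("x", U): [], ("x", (0,)): [(0,)]}
v = eval_map(w, init, update)
```

In this example eval(W) and W have the same principal action for x, as Lemma 19 states, because eval does not change the domain of the map.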

8.2. Definition of the Algebra 

We define another algebra, 𝒜''' = <A''', σ''', Π'''>, as follows. A''' is the set of pairs (T,V), where T
is an AAT and V is a value map. σ''' consists of the trivial AAT consisting of a single node U with
status 'active', and the value map which has V(x,U) equal to init(x), for all x, and is otherwise
undefined. Π''' consists of six events (a)-(f).

In all the events, we assume that A ∈ act - {U}. Events (a)-(c), (e) and (f) are identical to the
corresponding events of 𝒜''. Event (d) is also identical, except for the change indicated below.

(d2) Effect

(d24) V(x,A) ← update(A)(u).

8.3. Simulation 

Define a mapping h'' from 𝒜''' to 𝒜'' as follows. Let h''(T,V) = {(T,W): eval(W) = V}. h'' maps all
events to events of the same name.

Lemma 20: h'' is a simulation of 𝒜'' by 𝒜'''.

Proof: It suffices to show that h'' is a possibilities mapping. Properties (a) and (d) are
easy to check. Let π' ∈ Π'''. If π' is any event except for a perform event, then properties
(b) and (c) are immediate.

Assume π' is perform_{A,u}. Assume (T,V) is computable in 𝒜''', (T,W) ∈ h''(T,V), (T,W)
is computable in 𝒜'', π' is defined for (T,V) and (T',V') = π'(T,V). Lemma 19 implies that
property (b) holds, i.e. that π = perform_{A,u} is defined on (T,W). It follows from the effects
of the two events that π(T,W) = (T',W') for some version map W'. In order to show
property (c), it suffices to show that eval(W') = V'. Since eval(W) = V, we only need to
consider the values which change because of the present event, i.e. we need to show that
result(x,W'(x,A)) = V'(x,A). But result(x,W'(x,A)) = result(x,W(x,B) • (A)), where B is the
principal action for x in W, = update(A)(result(x,W(x,B))) = update(A)(V(x,B)) since
eval(W) = V. But B is the principal action for x in V, by Lemma 19, so u = V(x,B).
Therefore, the last term in the extended equality is equal to update(A)(u), which is equal
to V'(x,A) by definition.

□
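
The equality at the heart of this proof can be replayed on a small example. The following Python sketch is illustrative only and rests on assumptions that are not part of the formal development: a single object x is fixed, actions are tuples of child indices with root U = (), and result(x,s) applies update(A) for each A in s in order, starting from init(x). The function names are hypothetical.

```python
from functools import reduce

U = ()  # root action (assumed encoding: tuples of child indices)


def result(init_value, update, steps):
    # Assumed meaning of result(x,s): fold update(A) over s, starting at init(x).
    return reduce(lambda val, a: update[a](val), steps, init_value)


def principal(m):
    # Deepest action holding an entry; m maps actions to entries for one object x.
    return max(m, key=len)


def perform_versions(w, a):
    # Version-map effect: W'(x,A) = W(x,B) . (A), with B the principal action.
    out = dict(w)
    out[a] = w[principal(w)] + [a]
    return out


def perform_values(v, a, update):
    # Value-map effect (d24): V'(x,A) = update(A)(u), with u the principal value.
    out = dict(v)
    out[a] = update[a](v[principal(v)])
    return out


def eval_map(w, init_value, update):
    return {a: result(init_value, update, seq) for a, seq in w.items()}


# Access (0,0) has performed and released its lock to its parent (0,).
update = {(0, 0): lambda n: n + 5, (0, 1): lambda n: n * 2}
w = {U: [], (0,): [(0, 0)]}
v = eval_map(w, 3, update)

# Performing access (0,1) at either level gives corresponding states.
w_next = perform_versions(w, (0, 1))
v_next = perform_values(v, (0, 1), update)
```

Here eval(W') and V' coincide, which is exactly the condition eval(W') = V' required for property (c).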

Theorem 21: h ∘ h' ∘ h'' is a simulation of 𝒜 by 𝒜'''.

Proof: Immediate from Theorem 18 and Lemmas 20 and 1.

□

9. The Algorithm 

The only remaining task is to describe a distributed locking algorithm, and show that it simulates 
the previous algorithm. In this section, a slightly simplified version (which doesn't distinguish read 
and write steps) of Moss' algorithm is described using a distributed algebra.

9.1. Notation and Definitions 
Let [k] denote {1,...,k}. 

We fix a particular k, as the number of nodes. For convenience, we designate the nodes by
identifiers in [k].

Let home: (act - {U}) ∪ obj → [k], with home(A) = home(object(A)) for all A ∈ accesses. Thus,
home partitions the actions and objects among the nodes. Let origin: (act - {U}) → [k] be defined so
that origin(A) = home(A) if parent(A) = U, and = home(parent(A)) otherwise.

In order to describe the local state of each node, it is convenient to define a generalization of
action trees. Thus, we define an action summary T to consist of components vertices_T, active_T,
committed_T, and aborted_T, where vertices_T is any finite subset of act (not necessarily closed under
the parent operation), and the remaining three components form a partition of vertices_T. The notation
done_T and status_T is also extended in the obvious way. If T and T' are action summaries or action
trees, we say that T ≤ T' provided that vertices_T ⊆ vertices_T', and correspondingly for committed_T
and aborted_T. We also define T'' = T ∪ T' so that vertices_T'' = vertices_T ∪ vertices_T', and similarly
for committed_T'' and aborted_T''. An action summary will be used to describe partial knowledge of the
latest status of the transactions.
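
A minimal Python sketch of action summaries, under assumptions made only for illustration: actions are arbitrary hashable labels, the class name Summary is hypothetical, and active_T is derived as the vertices that are neither committed nor aborted rather than stored separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """An action summary: partial knowledge of which actions exist and their status."""
    vertices: frozenset = frozenset()
    committed: frozenset = frozenset()
    aborted: frozenset = frozenset()

    @property
    def active(self):
        # The three status sets partition the vertices, so active is the remainder.
        return self.vertices - self.committed - self.aborted

    @property
    def done(self):
        return self.committed | self.aborted

    def __le__(self, other):
        # T <= T': componentwise containment of vertices, committed and aborted.
        return (self.vertices <= other.vertices
                and self.committed <= other.committed
                and self.aborted <= other.aborted)

    def __or__(self, other):
        # T U T': componentwise union of vertices, committed and aborted.
        return Summary(self.vertices | other.vertices,
                       self.committed | other.committed,
                       self.aborted | other.aborted)


# t1 knows only that A exists; t2 also knows B and that A has committed.
t1 = Summary(vertices=frozenset({"A"}))
t2 = Summary(vertices=frozenset({"A", "B"}), committed=frozenset({"A"}))
merged = t1 | t2
```

Note that A is active in t1 but committed in the union: merging summaries can only move an action from active to done, which is why a summary is read as partial knowledge of the latest status.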

9.2. Definition of the Algebra 

We describe the algorithm as the algebra ℬ = <B, τ, P>, which is distributed over I = [k] ∪
{'buffer'}. The elements of [k] correspond to k nodes of a distributed system, and the buffer
corresponds to the entire message system. The components are defined as follows. Let B be the
Cartesian product of state sets B_i, where i ∈ I.

If i ∈ [k] (that is, if i corresponds to a node), then B_i consists of the values of two variables, i.T,
which contains an action summary, and i.V, which contains a value map. The action summary
recorded in i.T represents node i's knowledge of the latest status of various transactions. The value
map in i.V contains the latest value map information for all objects whose home is i.

If i = 'buffer', then B_i consists of the values of variables M_j, j ∈ [k], each of which contains an
action summary. The action summary in M_j represents all the information which has been sent to
node j during the entire computation.

The initial state τ is a vector of initial states for all the components. If i ∈ [k], then τ_i has i.T
initialized as the trivial action summary, having no vertices, and i.V initialized so that i.V(x,U) = init(x)
for all x with home(x) = i, and otherwise undefined. If i = 'buffer', then τ_i has each M_j equal to the
trivial action summary.

The algorithm has eight kinds of events. Six correspond closely to the six events of 𝒜''': four
record the creation, commit and abort of actions and the performance of data accesses, and two
manipulate locks. The other two correspond to the sending and receiving of messages. The events
are listed below. As usual, we present them by listing a precondition and the effect on the state. In
addition, we define d(π), the doer of each step.

In all cases, we assume that A ∈ act - {U}.

(a) create_{i,A}, origin(A) = i

(a1) Precondition

(a11) A ∉ i.vertices_T.

(a12) If parent(A) ≠ U then parent(A) ∈ i.vertices_T - i.committed_T.

(a2) Effect

(a21) i.vertices_T ← i.vertices_T ∪ {A}.
(a22) i.status_T(A) ← 'active'.

(a3) Doer: i

(b) commit_{i,A}, A ∉ accesses, home(A) = i

(b1) Precondition

(b11) A ∈ i.active_T.

(b12) children(A) ∩ i.vertices_T ⊆ i.done_T.

(b2) Effect

(b21) i.status_T(A) ← 'committed'.

(b3) Doer: i

(c) abort_{i,A}, A ∉ accesses, home(A) = i

(c1) Precondition

(c11) A ∈ i.active_T.

(c2) Effect

(c21) i.status_T(A) ← 'aborted'.

(c3) Doer: i

(d) perform_{i,A,u}, A ∈ accesses, x = object(A), u ∈ values(x),
home(A) = i, home(x) = i

(d1) Precondition

(d11) A ∈ i.active_T.

(d12) {B: i.V(x,B) is defined} ⊆ proper-anc(A).

(d13) u is the principal value of x in i.V.

(d2) Effect

(d21) i.status_T(A) ← 'committed'.
(d22) i.V(x,A) ← update(A)(u).

(d3) Doer: i

(e) release-lock_{i,A,x}, home(x) = i

(e1) Precondition

(e11) i.V(x,A) is defined.
(e12) A ∈ i.committed_T.

(e2) Effect

(e21) i.V(x,parent(A)) ← i.V(x,A).
(e22) i.V(x,A) ← undefined.

(e3) Doer: i

(f) lose-lock_{i,A,x}, home(x) = i

(f1) Precondition

(f11) i.V(x,A) is defined.
(f12) anc(A) ∩ i.aborted_T ≠ ∅.

(f2) Effect

(f21) i.V(x,A) ← undefined.

(f3) Doer: i

(g) send_{i,j,T'}, T' an action summary

(g1) Precondition

(g11) T' ≤ i.T.

(g2) Effect

(g21) M_j ← M_j ∪ T'.

(g3) Doer: i

(h) receive_{i,T'}, T' an action summary

(h1) Precondition

(h11) T' ≤ M_i.

(h2) Effect

(h21) i.T ← i.T ∪ T'.

(h3) Doer: buffer

Thus, (a)-(f) correspond closely to (a)-(f) of 𝒜'''. Events (g) and (h) are the new communication
events. These conditions say that any communication is allowed at any time, which sends any of i's
action summary information from i to j.
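
The node-local events (d), (e), (f) and the effect (h2) can be sketched as methods on a single node's state. This Python sketch is an illustration under stated assumptions, not the algorithm as specified: actions are encoded as tuples of child indices with root U = (), preconditions are written as assertions, the principal value u is computed rather than passed as an event parameter, and the message buffer with its precondition (h11) is omitted. The class and method names are hypothetical.

```python
U = ()  # root action; assumed encoding: an action is a tuple of child indices


def parent(a):
    return a[:-1]


def anc(a):
    """All ancestors of a, including a itself and U."""
    return {a[:n] for n in range(len(a) + 1)}


class Node:
    """Local state of node i: action summary i.T (three sets) and value map i.V."""

    def __init__(self, init):
        self.vertices, self.committed, self.aborted = set(), set(), set()
        self.V = {(x, U): val for x, val in init.items()}  # i.V(x,U) = init(x)

    def holders(self, x):
        return {a for (y, a) in self.V if y == x}

    def perform(self, a, x, update):
        active = self.vertices - self.committed - self.aborted
        assert a in active                                  # (d11)
        assert self.holders(x) <= anc(a) - {a}              # (d12)
        u = self.V[(x, max(self.holders(x), key=len))]      # (d13) principal value
        self.committed.add(a)                               # (d21)
        self.V[(x, a)] = update(u)                          # (d22)
        return u

    def release_lock(self, a, x):
        assert (x, a) in self.V and a in self.committed     # (e11), (e12)
        self.V[(x, parent(a))] = self.V.pop((x, a))         # (e21), (e22)

    def lose_lock(self, a, x):
        assert (x, a) in self.V and anc(a) & self.aborted   # (f11), (f12)
        del self.V[(x, a)]                                  # (f21)

    def receive(self, vertices, committed, aborted):
        # (h2): merge a received action summary into i.T.
        self.vertices |= vertices
        self.committed |= committed
        self.aborted |= aborted


node = Node({"x": 0})
node.receive({(0,), (0, 0)}, set(), set())           # learn of (0,) and its child
seen = node.perform((0, 0), "x", lambda n: n + 1)     # access reads 0, writes 1
node.release_lock((0, 0), "x")                        # lock passes up to (0,)
after_release = dict(node.V)
node.receive(set(), set(), {(0,)})                    # learn that (0,) aborted
node.lose_lock((0,), "x")                             # the value 1 is discarded
```

After the abort is learned and the lock is lost, only the entry for U remains, so later accesses again see init(x). This is the resilience behaviour that the lose-lock event provides.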

Lemma 22: ℬ is an algebra, which is distributed over I using d.
Proof: Straightforward.



9.3. Simulation 

Now define an interpretation h''' from ℬ to 𝒜''' by mapping the first six types of events to the
events of the same name, suppressing the index in [k], and mapping the other two types of events to
Λ.

If b € B, then we add "[b]" to the end of a variable name to denote the value of that variable in 
state b. 

For each i ∈ I, we define a mapping h_i from B to 𝒫(A''') as follows. If i ∈ [k], then (T,V) ∈ h_i(b)
exactly if (T,V) is computable in 𝒜''' and the following are true:

- vertices_T ∩ {A: origin(A) = i} ⊆ i.vertices_T[b] ⊆ vertices_T.

- committed_T ∩ {A: home(A) = i} ⊆ i.committed_T[b] ⊆ committed_T.

- aborted_T ∩ {A: home(A) = i} ⊆ i.aborted_T[b] ⊆ aborted_T.

- i.V[b] is the restriction of V to {(x,A): home(x) = i}.

If i = 'buffer', then (T,V) ∈ h_i(b) exactly if (T,V) is computable in 𝒜''' and M_j[b] ≤ T for each j ∈ [k].

If (T,V) ∈ h_i(b), then we also say that (T,V) is i-consistent with b.
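
For a node i ∈ [k], the four containment conditions can be written as a predicate. The following Python sketch is illustrative and makes several assumptions: summaries are dictionaries of sets, value maps are dictionaries keyed by (object, action), origin and home are passed in as functions, and the requirement that (T,V) be computable in the fourth-level algebra is not checked. The name i_consistent and the sample data are hypothetical.

```python
def i_consistent(i, T, V, local_T, local_V, origin, home):
    """Check the four conditions relating a global state (T,V) to node i's state.

    T, local_T: dicts with keys 'vertices', 'committed', 'aborted' (sets of actions).
    V, local_V: dicts keyed by (object, action).
    """
    def sandwiched(key, owner):
        # Node i must know every action it owns, and must know nothing false.
        owned = {a for a in T[key] if owner(a) == i}
        return owned <= local_T[key] <= T[key]

    restriction = {(x, a): val for (x, a), val in V.items() if home(x) == i}
    return (sandwiched("vertices", origin)
            and sandwiched("committed", home)
            and sandwiched("aborted", home)
            and local_V == restriction)


# Two nodes; object x and action A live at node 1, object y and action B at node 2.
home = lambda z: 1 if z in ("x", "A") else 2
origin = lambda a: 1
T = {"vertices": {"A", "B"}, "committed": {"A"}, "aborted": set()}
V = {("x", "U"): 0, ("y", "U"): 7}

up_to_date = {"vertices": {"A", "B"}, "committed": {"A"}, "aborted": set()}
stale = {"vertices": {"A", "B"}, "committed": set(), "aborted": set()}
ok = i_consistent(1, T, V, up_to_date, {("x", "U"): 0}, origin, home)
bad = i_consistent(1, T, V, stale, {("x", "U"): 0}, origin, home)
```

The second call fails because node 1 is the home of A and therefore must already know that A has committed; a node may lag only on actions whose home or origin is elsewhere.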

We now proceed to prove lemmas corresponding to the properties required in the definition of a 
local mapping. The proofs are long, but are very straightforward case analyses. 
Lemma 23: For all i ∈ I, σ''' ∈ h_i(τ).
Proof: Immediate from the definitions. 

□ 

Lemma 24: Assume i ∈ I. Assume π' ∈ P, d(π') = i, π = h'''(π') ∈ Π''', a and a' are
computable in 𝒜''' and ℬ, respectively, a ∈ h_i(a') and a' ∈ domain(π'). Then a ∈
domain(π).

Proof: Let a be (T,V).

First, assume that π' is create_{i,A}, so that π is create_A. Then origin(A) = i. Since a' ∈
domain(π'), A ∉ i.vertices_T[a']. Since (T,V) is i-consistent with a', A ∉ vertices_T, thus
showing (a11). If parent(A) = U, then the fact that (T,V) is computable and Lemma 16
imply that parent(A) ∈ active_T, thus showing (a12) for this case. On the other hand, if
parent(A) ≠ U, then the precondition for π' shows that parent(A) ∈ i.vertices_T[a'] -
i.committed_T[a']. The fact that (T,V) is i-consistent with a' implies that parent(A) ∈
vertices_T - committed_T. Thus, (a12) holds.

Second, consider π' = commit_{i,A}, so that π is commit_A. The precondition for π'
shows that A ∈ i.active_T[a']. The fact that (T,V) is i-consistent with a' implies that A ∈
active_T, thus showing (b11). The precondition for π' shows that children(A) ∩
i.vertices_T[a'] ⊆ i.done_T[a']. The fact that (T,V) is i-consistent with a' implies that
children(A) ∩ vertices_T ⊆ done_T, thus showing (b12).

Third, assume π' = abort_{i,A}, so that π is abort_A. This case is similar to the first half
of the previous case.

Fourth, assume π' = perform_{i,A,u}, so that π is perform_{A,u}. Then home(A) = i.
Assume object(A) = x, so that home(x) = i. (d11) is argued as in the preceding two cases.
We show (d12). Choose B so that V(x,B) is defined. Since (T,V) is i-consistent with a' and
home(x) = i, i.V(x,B)[a'] is also defined. The precondition for π' implies that B ∈
proper-anc(A), as needed. Next, we show (d13). The precondition for π' implies that u is the
principal value for x in i.V[a']. Since (T,V) is i-consistent with a', u is also the principal
value for x in V, as needed.

If π' is one of (e) or (f), then π' involves some x with home(x) = i. Assume that π'
involves A. The precondition for π' implies that i.V(x,A)[a'] is defined. Since (T,V) is
i-consistent with a', it follows that V(x,A) is defined, thus showing both (e11) and (f11).

If π' is a release-lock_{i,A,x} step, then the precondition for π' implies that A ∈
i.committed_T[a']. Since (T,V) is i-consistent with a', A ∈ committed_T, thus showing (e12).

Finally, if π' is a lose-lock_{i,A,x} step, the precondition for π' implies that anc(A) ∩
i.aborted_T[a'] ≠ ∅. Since (T,V) is i-consistent with a', it follows that A is dead in T, thus
showing (f12).

□

Lemma 25: Assume i, j ∈ I. Assume π' ∈ P, d(π') = i, π = h'''(π') ∈ Π''', a and a' are
computable in 𝒜''' and ℬ, respectively, a ∈ h_i(a') ∩ h_j(a'), and a' ∈ domain(π'). If b' =
π'(a'), then π(a) ∈ h_j(b').

Proof: Let a = (T,V) and π(a) = (T',V'). Lemma 24 implies that a ∈ domain(π).

If j ≠ i, then it is easy to see that all the containments are preserved, since the sets of
actions on the right sides are only increased, while the sets on the left sides are
unchanged. The property involving V is also easily seen to be preserved. So assume j = i.
We consider the six kinds of events in turn.



First, assume π' is of the form create_{i,A}, commit_{i,A} or abort_{i,A}. Then V' = V, and T'
is exactly like T except that A is added to vertices_T, committed_T or aborted_T as appropriate.
Also, b' is just like a' except that A is added to i.vertices_T, i.committed_T, or i.aborted_T, as
appropriate. Since (T,V) is i-consistent with a', it is easy to see that all the containments
change in such a way as to ensure that (T',V') is i-consistent with b'.

If π' is of the form perform_{i,A,u}, then home(A) = i. Let x = object(A). Then home(x)
= i. T' is just like T except that A is added to committed_T and is given label u, and data_T is
augmented with all pairs in {(B,A): B ∈ datasteps_T(x)} ∪ {(A,A)}. V' is just like V except that
V'(x,A) is defined to be update(A)(u). b' is just like a' except that A is added to
i.committed_T, and i.V(x,A) is defined to be update(A)(u). Since (T,V) is i-consistent with a',
it is easy to see that (T',V') is i-consistent with b': most of the properties are immediate.
We just check the last property; the only change involves A. We have already noted that
i.V(x,A)[b'] = update(A)(u) = V'(x,A). This is as needed.

If π' is of one of the forms (e) or (f), then T' = T and i.T[b'] = i.T[a']. Thus, it is clear
that the containments are all preserved. It is also easy to check that the final property is
preserved.

□

Lemma 26: Assume i, j ∈ I. Assume π' ∈ P, d(π') = i, h'''(π') = Λ, a and a' are
computable in 𝒜''' and ℬ, respectively, a ∈ h_i(a') ∩ h_j(a'), and a' ∈ domain(π'). If b' =
π'(a'), then a ∈ h_j(b').

Proof: Let a = (T,V).

First, assume that π' is send_{i,i',T'}. If j ≠ 'buffer', then b'_j = a'_j, and the conclusion is
immediate. So assume that j = 'buffer'. Since (T,V) is j-consistent with a', each action
summary M_l[a'] ≤ T. The precondition for π' implies that T' ≤ i.T[a']. Since (T,V) is
i-consistent with a', it follows that i.T[a'] ≤ T, and hence T' ≤ T. Now, each M_l[b'] ≤ M_l[a']
∪ T'. Therefore, each M_l[b'] ≤ T, as needed.

Next, assume that π' is of the form receive_{i',T'}, so that i = 'buffer'. The only nontrivial
case is j = i'. We must show that j.T[b'] ≤ T. But j.T[b'] = j.T[a'] ∪ T'. The j-consistency
of (T,V) with a' shows that j.T[a'] ≤ T. The precondition for π' shows that T' ≤ M_j[a'].
Since (T,V) is i-consistent with a', M_j[a'] ≤ T. Thus, T' ≤ T. Therefore, j.T[b'] ≤ T, as
needed.

□

Lemma 27: h''' and h_i, i ∈ I, form a local mapping from ℬ to 𝒜'''.

Proof: Immediate from Lemmas 23, 24, 25, and 26. 



□ 



Now extend h''' to B ∪ P, by defining h'''(b) = ∩_{i ∈ I} h_i(b).

Lemma 28: h''' is a simulation of 𝒜''' by ℬ.
Proof: Immediate by Lemmas 27 and 4.



The main correctness theorem now follows. 

Theorem 29: The mapping h ∘ h' ∘ h'' ∘ h''' is a simulation of 𝒜 by ℬ.
Proof: Immediate from Lemma 28, Lemma 1 and Theorem 21. 



10. Conclusions 

In this paper, we have presented a detailed proof of a variant of Moss' concurrency control 
algorithm for nested transactions. Along the way, we have developed a substantial amount of basic 
theory for nested transactions. The basic framework, especially the definitions and results involving 
visibility, should be of further use. 

There is much more to be done, however. The framework presented in this paper is not powerful
enough to describe all the correctness conditions one might want for nested transactions. In
particular, we do not model the correspondence between what the system does and what it is 
requested to do by the transactions. This deficiency is at least partly due to the fact that we have 
chosen not to model the transactions explicitly. In order to describe everything we might want, we will 
probably have to incorporate some type of model for the transactions into the framework. 

We have only proved correctness of one variant of Moss' algorithm. There are many other related
algorithms for which similar proofs ought to be developed. Certainly, Moss' complete algorithm (with
a distinction between read and write operations) should be proved correct; we do not expect this
extension to be very difficult. The orphan algorithm mentioned in the introduction should be verified;
obtaining an understandable proof for this algorithm seems like a much harder task. Also, other
implementations for nested transactions, such as Reed's, should be proved correct. It would be
interesting to see to what extent the theory developed for one of these algorithms is usable for the
others.

The proof presented here has a very interesting structure. It describes algorithms as algebras, 
and uses a series of five levels of abstraction. Correctness is shown using four simulation mappings. 
The interesting and nontrivial concurrency control arguments are made in proving the correctness of 
the first two simulations. The correctness of the first simulation expresses the fact that certain 
conditions imply serializability. The correctness of the second simulation expresses the fact that a 
form of locking satisfies these conditions. Successive levels refine the algorithm, providing more 
implementation detail, condensing the information that is kept, and distributing the processing. 
Proofs at these lower levels are straightforward checks of the local mapping properties. 

There is more to be done in exploring the usefulness of this proof structure for other distributed
algorithms.